Monday, March 24, 2014

Big Data at the Nigerian Movies

Using Big Data and Natural Language Processing techniques to query the Global Graph for Nigerian Movies and Actors.

Using the Virtuoso SPARQL editor, I sent the following queries to the LinkedMDB and DBpedia SPARQL endpoints using Federated SPARQL (the SERVICE keyword).

Nigerian Movies on LinkedMDB

The following query lists all the Nigerian movies in the LinkedMDB RDF store. In SPARQL terms: "show me movies whose 'movie:country' is 'country:NG', which may have a 'dc:title' or a 'dc:date'" :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>

 select ?s ?Title ?Date

 where {
 SERVICE <http://data.linkedmdb.org/sparql> {
 ?s movie:country   country:NG .
 optional { ?s dc:title ?Title. }
 optional { ?s dc:date ?Date. }
  }

 } 
 
[View query in Virtuoso SPARQL Editor]

The answer looks like this.

 ---------------------------------------------------------------------------------------------------
 | S                                               | Title                          | Date         |
 ===================================================================================================
 | <http://data.linkedmdb.org/resource/film/10659> | "Rise and Fall of Idi Amin"    | "1980"       |
 | <http://data.linkedmdb.org/resource/film/16051> | "Osuofia in London"            |              |
 | <http://data.linkedmdb.org/resource/film/16118> | "Ezra"                         | "2007-01-23" |
 | <http://data.linkedmdb.org/resource/film/16130> | "Aki na Ukwa"                  |              |
 | <http://data.linkedmdb.org/resource/film/16131> | "2 Rats"                       | "2003"       |
 | <http://data.linkedmdb.org/resource/film/16463> | "Above Death: In God We Trust" | "2003"       |
 | <http://data.linkedmdb.org/resource/film/17663> | "Abuja Connection"             |              |
 | <http://data.linkedmdb.org/resource/film/24122> | "Missing Angel"                |              |
 ---------------------------------------------------------------------------------------------------

 
[Run the query using URIburner Endpoint].

Wow! They have only 8 Nigerian movies in their dataset.
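The same query can also be sent from a script instead of the Virtuoso editor. The following is a minimal Python sketch, not the way the queries in this post were run. It assumes the URIBurner endpoint listed in the References accepts a standard SPARQL Protocol GET request with a `query` parameter and honours the `Accept` header for JSON results. The names `build_request_url` and `run_query` are made up for this illustration.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the References section of this post.
URIBURNER_ENDPOINT = "http://linkeddata.uriburner.com/sparql/"

# The first query of this post, without the unused prefixes.
NIGERIAN_MOVIES_QUERY = """
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX country: <http://data.linkedmdb.org/resource/country/>

select ?s ?Title ?Date
where {
  SERVICE <http://data.linkedmdb.org/sparql> {
    ?s movie:country country:NG .
    optional { ?s dc:title ?Title. }
    optional { ?s dc:date ?Date. }
  }
}
"""


def build_request_url(endpoint, query):
    """Build a SPARQL Protocol GET URL: the query goes in the 'query' parameter."""
    params = urllib.parse.urlencode({"query": query})
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + params


def run_query(endpoint, query, timeout=30):
    """Send the query and return the decoded JSON results (needs network access)."""
    request = urllib.request.Request(
        build_request_url(endpoint, query),
        # Assumption: the endpoint supports content negotiation for JSON results.
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL works offline; run_query() is only called when you want live data.
url = build_request_url(URIBURNER_ENDPOINT, NIGERIAN_MOVIES_QUERY)
```

Calling `run_query(URIBURNER_ENDPOINT, NIGERIAN_MOVIES_QUERY)` would perform the actual request; whether it still returns the rows shown above depends on the endpoints being reachable.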

Who acted in the movie "2 Rats"?

Let us find out who acted in the movie "2 Rats" by using the following query. In SPARQL terms: "show me every 'movie:actor' of '.../film/16131' and also tell me each actor's '?Name'" :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>

 select ?Name

 where {
 SERVICE <http://data.linkedmdb.org/sparql> {
 <http://data.linkedmdb.org/resource/film/16131>  movie:actor ?actor .
 ?actor movie:actor_name ?Name .

  }

 } 
 
[View query in Virtuoso SPARQL Editor].

The answer.

 --------
 | Name |
 ========
 --------
 
[Run the query using URIburner Endpoint].

Wow! LinkedMDB does not have that information. Let me see what else it has on the movie with the following query. In SPARQL terms: "show me all you have on the subject '.../film/16131'" :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbowl: <http://dbpedia.org/ontology/>

 select *

 where {
 SERVICE <http://data.linkedmdb.org/sparql> {
 <http://data.linkedmdb.org/resource/film/16131> ?PropertyName ?Value
  

  }

 } 


 
[View query in Virtuoso SPARQL Editor]

The answer.

 ----------------------------------------------------------------------------------------------------------------------------
 | property                                          | object                                                               |
 ============================================================================================================================
 | <http://xmlns.com/foaf/0.1/based_near>            | <http://sws.geonames.org/2328926/>                                   |
 | <http://xmlns.com/foaf/0.1/page>                  | <http://www.freebase.com/view/guid/9202a8c04000641f8000000005674af3> |
 | movie:initial_release_date                        | "2003"                                                               |
 | dc:date                                           | "2003"                                                               |
 | movie:country                                     | country:NG                                                           |
 | movie:language                                    | <http://www.lingvoj.org/lingvo/en>                                   |
 | movie:filmid                                      | "16131"^^<http://www.w3.org/2001/XMLSchema#int>                      |
 | dc:title                                          | "2 Rats"                                                             |
 | rdfs:label                                        | "2 Rats"                                                             |
 | <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | movie:film                                                           |
 ----------------------------------------------------------------------------------------------------------------------------

 
[Run the query using URIburner Endpoint]

This shows only the release date, country, language and title of the movie, which is not much. Let us check whether DBpedia (http://dbpedia.org), the very nucleus of the Linked Open Data web, has more data than LinkedMDB.

"2 Rats" at DBpedia

Let us see if we can catch "2 Rats" in DBpedia with the following query. In SPARQL terms: "show us the ?moviename and ?actor of any dbpedia:Film whose name starts with '2 Rats' and whose dbprop:country is 'nigeria'" :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>

 select ?s as ?Movie ?moviename as ?Name ?actor as ?Actor

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.

 ?s dbpedia:starring ?dbname .
 ?s rdfs:label ?moviename.
 ?dbname dbprop:name ?actor.
 filter regex(?moviename,"^2 Rats") .
 filter regex(?country,"^nigeria","i") .


  }

 } 
 
[View query in Virtuoso SPARQL Editor]

The answer.

 -------------------------------------------------------------------------------
 | s                                    | moviename   | actor                  |
 ===============================================================================
 | <http://dbpedia.org/resource/2_Rats> | "2 Rats"@en | "Osita Iheme"@en       |
 | <http://dbpedia.org/resource/2_Rats> | "2 Rats"@en | "Iheme, Osita"@en      |
 | <http://dbpedia.org/resource/2_Rats> | "2 Rats"@en | "Chinedu Ikedieze"@en  |
 | <http://dbpedia.org/resource/2_Rats> | "2 Rats"@en | "Ikedieze, Chinedu"@en |
 -------------------------------------------------------------------------------
 
[Run the query using URIburner Endpoint]

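Now we can see that DBpedia has the actors' names. Each actor appears twice because dbprop:name holds both the "First Last" and the "Last, First" form of the name.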
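An application that displays this cast list would probably want each actor only once. A minimal Python sketch of one way to do that follows; it assumes that a name containing a comma is always in "Last, First" order, which holds for the four values above but not for every possible name. The helper names are made up for this illustration.

```python
def normalize_name(name):
    """Turn 'Last, First' into 'First Last'; leave other names unchanged."""
    if "," in name:
        # Assumption: exactly one comma separates the family name from the rest.
        last, first = (part.strip() for part in name.split(",", 1))
        return f"{first} {last}"
    return name.strip()


def unique_actors(names):
    """Return the normalized names in first-seen order, without duplicates."""
    seen = set()
    result = []
    for name in names:
        normalized = normalize_name(name)
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


# The four ?actor values returned by the DBpedia query above.
names = ["Osita Iheme", "Iheme, Osita", "Chinedu Ikedieze", "Ikedieze, Chinedu"]
print(unique_actors(names))  # ['Osita Iheme', 'Chinedu Ikedieze']
```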

What else do they have on the movie?

Let us check for the director and distributor with the following query. In SPARQL terms: "show me the ?director and ?distributor of any dbpedia:Film whose name starts with '2 Rats' and whose dbprop:country is 'nigeria'" :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>

 select distinct ?s as ?Movie ?director ?distributor

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.

 ?s dbpedia:starring ?dbname .
 ?s rdfs:label ?moviename.
 ?s rdfs:comment ?description .
 ?s dbprop:director ?director.
 ?s dbprop:distributor ?distributor.
 ?dbname dbprop:name ?actor.
 filter regex(?moviename,"^2 Rats") .
 filter regex(?country,"^nigeria","i") .


  }

 } 
 
[View query in Virtuoso SPARQL Editor]

and the result shows

 -----------------------------------------------------------------------------------------------
 | s                                    | director         | distributor                       |
 ===============================================================================================
 | <http://dbpedia.org/resource/2_Rats> | "Andy Chukwu"@en | "OJ Productions International"@en |
 -----------------------------------------------------------------------------------------------
 
[Run the query using URIburner Endpoint]

So we can now see that DBpedia has much more information on this movie. We can get more information, like the movie description, with the following query.

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>

 select distinct ?description as ?Summary

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.

 ?s dbpedia:starring ?dbname .
 ?s rdfs:label ?moviename.
 ?s rdfs:comment ?description .
 ?s dbprop:director ?director.
 ?s dbprop:distributor ?distributor.
 ?dbname dbprop:name ?actor.
 filter regex(?moviename,"^2 Rats") .
 filter regex(?country,"^nigeria","i") .


  }

 } 
 
[View query in Virtuoso SPARQL Editor]

and it showed....


 | "2 Rats is a 2003 Nigeria film. Nollywood's highest paid actors, Osita Iheme (A-boy) and Chinedu Ikedieze (Bobo) are two young boys whose father has been murdered by their uncle. In a selfish move, Amaechi Muonagor wants them to work as house boys in their father's own house. A-boy and Bobo have other plans. The film features performances by Aki na Pawpaw and can be dubbed as Nollywood's Home Alone."@en |
 
[Run the query using URIburner Endpoint]

Chinedu Ikedieze's filmography

Which other movies did Chinedu Ikedieze star in? Let us ask DBpedia with the following query. In SPARQL terms: "show us any 'dbpedia:Film' 'dbpedia:starring' 'dbr:Chinedu_Ikedieze', giving us the '?director' and '?distributor' if available" :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbr: <http://dbpedia.org/resource/>

 select distinct ?s as ?Movie ?director ?distributor

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbpedia:starring dbr:Chinedu_Ikedieze .
 optional {?s dbprop:director ?director. }
 optional {?s dbprop:distributor ?distributor. }

  }

 } 

 
[View query in Virtuoso SPARQL Editor]

and the result showed ...

 ---------------------------------------------------------------------------------------
 | s                  | director                   | distributor                       |
 =======================================================================================
 | dbr:2_Rats         | "Andy Chukwu"@en           | "OJ Productions International"@en |
 | dbr:Aki_na_Ukwa    | "Amayo Uzo Philips"@en     | "Kas-Vid International"@en        |
 | dbr:Games_Men_Play | dbr:Lancelot_Oduwa_Imasuen |                                   |
 ---------------------------------------------------------------------------------------
 
[Run the query using URIburner Endpoint]

Hmm! Only 3 films.

Chinedu Ikedieze and Osita Iheme

Which films in DBpedia have starred both Chinedu and Osita together? Let us ask with the following query :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbr: <http://dbpedia.org/resource/>

 select distinct ?s as ?Movie ?director ?distributor

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbpedia:starring dbr:Chinedu_Ikedieze .
 ?s dbpedia:starring dbr:Osita_Iheme .
 optional {?s dbprop:director ?director. }
 optional {?s dbprop:distributor ?distributor. }

  }

 } 
 
[View query in Virtuoso SPARQL Editor]

The answer.

 --------------------------------------------------------------------------------
 | s               | director               | distributor                       |
 ================================================================================
 | dbr:2_Rats      | "Andy Chukwu"@en       | "OJ Productions International"@en |
 | dbr:Aki_na_Ukwa | "Amayo Uzo Philips"@en | "Kas-Vid International"@en        |
 --------------------------------------------------------------------------------
 
[Run the query using URIburner Endpoint]

Only 2 movies.

Nollywood in DBpedia

Let us ask DBpedia which Nigerian movies it has with the following query. In SPARQL terms: "show us every dbpedia:Film whose dbprop:country is 'nigeria', and give us its name in English" :

 ## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbr: <http://dbpedia.org/resource/>

 select distinct ?s as ?Movie ?moviename as ?Name ?country as ?Country

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.
 optional {?s rdfs:label ?moviename. }

 filter regex(?country,"^nigeria","i") .
 filter (lang(?moviename) ='en') .
  }

 } 

 
[View query in Virtuoso SPARQL Editor]

The answer.

 -----------------------------------------------------------------------------------------------------------------------
 | s                                                             | country      | moviename                            |
 =======================================================================================================================
 | <http://dbpedia.org/resource/Area_Boys_(film)>                | "Nigeria"@en | "Area Boys (film)"@en                |
 | dbr:Lilies_of_the_Ghetto                                      | "Nigeria"@en | "Lilies of the Ghetto"@en            |
 | dbr:Living_in_Bondage                                         | "Nigeria"@en | "Living in Bondage"@en               |
 | dbr:Mirror_Boy                                                | "Nigeria"@en | "Mirror Boy"@en                      |
 | dbr:Oni_Ise_Owo                                               | "Nigeria"@en | "Oni Ise Owo"@en                     |
 | <http://dbpedia.org/resource/Rag_Tag_(film)>                  | "Nigeria"@en | "Rag Tag (film)"@en                  |
 | dbr:Rising_Moon                                               | "Nigeria"@en | "Rising Moon"@en                     |
 | dbr:Tango_with_Me                                             | "Nigeria"@en | "Tango with Me"@en                   |
 | dbr:The_Mayors                                                | "Nigeria"@en | "The Mayors"@en                      |
 | dbr:2_Rats                                                    | "Nigeria"@en | "2 Rats"@en                          |
 | <http://dbpedia.org/resource/30_Days_(2006_film)>             | "Nigeria"@en | "30 Days (2006 film)"@en             |
 | dbr:Games_Men_Play                                            | "Nigeria"@en | "Games Men Play"@en                  |
 | dbr:The_Figurine                                              | "Nigeria"@en | "The Figurine"@en                    |
 | dbr:Jenifa                                                    | "Nigeria"@en | "Jenifa"@en                          |
 | <http://dbpedia.org/resource/Above_Death:_In_God_We_Trust>    | "Nigeria"@en | "Above Death: In God We Trust"@en    |
 | dbr:Abuja_Connection                                          | "Nigeria"@en | "Abuja Connection"@en                |
 | dbr:Aki_na_Ukwa                                               | "Nigeria"@en | "Aki na Ukwa"@en                     |
 | <http://dbpedia.org/resource/Anchor_Baby_(film)>              | "Nigeria"@en | "Anchor Baby (film)"@en              |
 | <http://dbpedia.org/resource/Black_Gold_(2011_Nigerian_film)> | "Nigeria"@en | "Black Gold (2011 Nigerian film)"@en |
 | <http://dbpedia.org/resource/Ezra_(film)>                     | "Nigeria"@en | "Ezra (film)"@en                     |
 | dbr:Missing_Angel                                             | "Nigeria"@en | "Missing Angel"@en                   |
 | dbr:Osuofia_in_London                                         | "Nigeria"@en | "Osuofia in London"@en               |
 -----------------------------------------------------------------------------------------------------------------------
 
[Run the query using URIburner Endpoint]

Wow! Only 22 movies.
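With both result sets in hand, a quick way to see how the two datasets overlap is to compare titles. The Python sketch below copies the titles from the two result tables in this post and matches them by exact title after removing DBpedia's trailing disambiguation such as "(film)". That matching rule is a simplifying assumption; a real application would follow links between the datasets rather than compare strings.

```python
import re

# Titles from the LinkedMDB result table in this post.
linkedmdb_titles = [
    "Rise and Fall of Idi Amin", "Osuofia in London", "Ezra", "Aki na Ukwa",
    "2 Rats", "Above Death: In God We Trust", "Abuja Connection", "Missing Angel",
]

# Labels from the DBpedia result table in this post.
dbpedia_labels = [
    "Area Boys (film)", "Lilies of the Ghetto", "Living in Bondage", "Mirror Boy",
    "Oni Ise Owo", "Rag Tag (film)", "Rising Moon", "Tango with Me", "The Mayors",
    "2 Rats", "30 Days (2006 film)", "Games Men Play", "The Figurine", "Jenifa",
    "Above Death: In God We Trust", "Abuja Connection", "Aki na Ukwa",
    "Anchor Baby (film)", "Black Gold (2011 Nigerian film)", "Ezra (film)",
    "Missing Angel", "Osuofia in London",
]


def strip_disambiguation(label):
    """Remove a trailing parenthetical such as ' (film)' or ' (2006 film)'."""
    return re.sub(r"\s*\([^)]*\)$", "", label)


dbpedia_titles = {strip_disambiguation(label) for label in dbpedia_labels}

in_both = [title for title in linkedmdb_titles if title in dbpedia_titles]
only_linkedmdb = [title for title in linkedmdb_titles if title not in dbpedia_titles]

print(len(in_both), "titles appear in both datasets")
print("Only in LinkedMDB:", only_linkedmdb)
```

On the tables above this reports 7 shared titles, with "Rise and Fall of Idi Amin" found only in the LinkedMDB result.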

Conclusion

The web of linked data, or Semantic Web, is a global information space consisting of inter-linked data. SPARQL enables applications to query this global graph or database for information about resources or entities.

I have shown how easy it is to query these graphs for information on Nollywood. We have also seen that not enough information on Nollywood exists in the global graph, so there is a need to publish more.

Why? Applications can consume the results of these kinds of queries to create a rich experience for users interacting with these resources, e.g. a movie recommendation app, an actor app, etc.
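As a rough illustration of such an application, the Python sketch below turns a SPARQL result in the W3C JSON results layout into plain rows and prints a small filmography. The sample data is filled in by hand from the Chinedu Ikedieze result table in this post, not fetched from DBpedia, and it assumes the result variable is named `Movie` after the `?s as ?Movie` alias. The helper names are made up for this illustration.

```python
# Hand-written sample in the SPARQL JSON results layout ("head" / "results").
SAMPLE_RESULTS = {
    "head": {"vars": ["Movie", "director", "distributor"]},
    "results": {"bindings": [
        {
            "Movie": {"type": "uri", "value": "http://dbpedia.org/resource/2_Rats"},
            "director": {"type": "literal", "xml:lang": "en", "value": "Andy Chukwu"},
            "distributor": {"type": "literal", "xml:lang": "en",
                            "value": "OJ Productions International"},
        },
        {
            "Movie": {"type": "uri", "value": "http://dbpedia.org/resource/Aki_na_Ukwa"},
            "director": {"type": "literal", "xml:lang": "en", "value": "Amayo Uzo Philips"},
            "distributor": {"type": "literal", "xml:lang": "en",
                            "value": "Kas-Vid International"},
        },
        {
            # No distributor binding: the OPTIONAL pattern did not match.
            "Movie": {"type": "uri", "value": "http://dbpedia.org/resource/Games_Men_Play"},
            "director": {"type": "uri",
                         "value": "http://dbpedia.org/resource/Lancelot_Oduwa_Imasuen"},
        },
    ]},
}


def rows_from_results(results):
    """Flatten the bindings into dicts; unbound variables become None."""
    variables = results["head"]["vars"]
    return [
        {var: binding.get(var, {}).get("value") for var in variables}
        for binding in results["results"]["bindings"]
    ]


def local_name(value):
    """Show the last path segment of a URI as a readable name."""
    if value.startswith("http://"):
        return value.rsplit("/", 1)[-1].replace("_", " ")
    return value


rows = rows_from_results(SAMPLE_RESULTS)
for row in rows:
    director = local_name(row["director"]) if row["director"] else "unknown"
    print(f"{local_name(row['Movie'])} (director: {director})")
```

The point of the flattening step is that the rest of the application can work with ordinary dictionaries and does not need to know anything about RDF.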

  Emeka Okoye

Keywords: linkedopendatang, linkedopendata, opendatang, linkeddata, lod, nigeria, sparql, films, movies, nollywood

References

http://www.linkedmdb.org/ - LinkedMDB.

http://dbpedia.org - DBpedia.

http://en.wikipedia.org/wiki/SPARQL - About SPARQL.

http://www.w3.org/TR/rdf-sparql-query/ - SPARQL Query Language for RDF.

http://semanticweb.org/wiki/SPARQL_endpoint - SPARQL Endpoint.

http://www.w3.org/TR/sparql11-federated-query/ - Federated SPARQL.

http://linkeddata.uriburner.com/sparql/ - URIBurner SPARQL Endpoint.

http://virtuoso.openlinksw.com/ - Openlink Virtuoso.
