Description (7UT)
DataGenie is a tab plug-in for Protege that allows Protege to read data from an arbitrary database. You can use either JDBC or the JDBC-ODBC bridge to connect to your database and move portions (or all) of it into Protege. Generally, each table becomes a class, each row becomes an instance, and each attribute (column) becomes a slot. In addition, if a relational database table has foreign key references to other tables, these can be replaced by Protege instance pointers when the database is converted into a knowledge base. (7UR)
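The table-to-class mapping described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the names `TableToClassSketch`, `FrameClass`, and `importTable` are hypothetical, the data is held in memory rather than read over JDBC, and none of this is DataGenie's actual source or the Protege API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: these types are NOT the Protege API or DataGenie's code.
final class TableToClassSketch {

    /** A frame-style class: a name, its slot names, and its instances. */
    static final class FrameClass {
        final String name;
        final List<String> slots = new ArrayList<>();
        final List<Map<String, Object>> instances = new ArrayList<>();

        FrameClass(String name) {
            this.name = name;
        }
    }

    /** Maps one table to one class: columns become slots, rows become instances. */
    static FrameClass importTable(String tableName, List<String> columns, List<List<Object>> rows) {
        FrameClass cls = new FrameClass(tableName);
        cls.slots.addAll(columns);
        for (List<Object> row : rows) {
            if (row.size() != columns.size()) {
                throw new IllegalArgumentException("row width does not match column count");
            }
            // One instance per row, with one slot value per column.
            Map<String, Object> instance = new LinkedHashMap<>();
            for (int i = 0; i < columns.size(); i++) {
                instance.put(columns.get(i), row.get(i));
            }
            cls.instances.add(instance);
        }
        return cls;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a database table.
        FrameClass patient = importTable(
            "Patient",
            Arrays.asList("id", "name"),
            Arrays.asList(
                Arrays.<Object>asList(1, "Ada"),
                Arrays.<Object>asList(2, "Grace")));
        System.out.println(patient.name + " slots=" + patient.slots
            + " instances=" + patient.instances.size());
    }
}
```

In a real import the column names and rows would come from the database connection; the sketch keeps them as plain lists so that the mapping itself is easy to follow.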
This plug-in is NOT a database back-end. The typical use-case for this plug-in is importing legacy data into Protege before doing additional knowledge acquisition or knowledge modeling. This plug-in (as written) does not include any capability for moving data in the opposite direction, i.e., from Protege classes and instances into a relational database. Another use-case for this plug-in might be as a database viewer. For efficiency, a database might be stored as a set of custom-designed database tables, and DataGenie could then be used to view portions of the schema from within the Protege user interface. (7US)
Please note: If you are using Protege-OWL, we recommend that you use the DataMaster plug-in instead. DataMaster supports both OWL and frame-based ontologies, whereas DataGenie was developed prior to the existence of OWL. (9CA)
Installation (7UU)
You have two options for installing DataGenie: (7UV)
Option 1 - Version 1.1 of the DataGenie tab is bundled with the full installation of Protege. If you have the full version of Protege installed on your machine, choose the Project -> Configure menu item and select DataGenieTab to enable DataGenie. (7UW)
Option 2 - Version 2.0.1 is available for download from this Wiki. The Stanford Protege Team decided not to bundle this newer version of DataGenie with the Protege installation because of some unresolved bugs. Despite the bugginess of this newer version, it is still useful (see the documentation section for a list of new features in the 2.0 series). Please note that this version was compiled against Protege 3.2 and JDK 1.5. (7UX)
To install version 2.0.1: (7UY)
- Delete version 1.1 from your current Protege installation. You can do this by deleting the folder entitled "edu.washington.datagenie" in your <protege-install-dir>/plugins directory. (7UZ)
- Unzip the contents of the installation ZIP file into your <protege-install-dir>/plugins directory. Make sure to preserve the path information in the ZIP file so that the proper subdirectory under the plugins directory is created. (7V0)
Example project - If you are using version 1.1, which is bundled with the full version of Protege, you already have an example project and database in the following directory: <protege-install-dir>/plugins/edu.washington.datagenie/examples/. If you are using version 2.0.1, you can download the example project from this Wiki. (7V1)
Documentation (7V2)
New Features in version 2.0.1: (7V3)
- JDBC support - Enter a JDBC Driver and URL and DataGenie will connect to the database. When tested with PostgreSQL, it was found that the JDBC driver had more SQL functionality than the JDBC-ODBC bridge. (7VE)
- Link to the PostgreSQL JDBC driver: http://jdbc.postgresql.org/ (7VC)
- Link to the MySQL JDBC driver: http://www.mysql.com/products/connector/j/ (7VD)
- Automatic processing of foreign keys into Protege instances - If table T1 has a foreign key relationship to table T2, then a new slot is added to class C1 (representing T1) whose value is an instance of C2 (representing T2). Each instance of C1 will look up the corresponding instance of C2 and add it to the new slot. (7V8)
- Automatic handling of bridge tables - If table T1 has exactly two columns, both columns are part of the primary key, and both columns are foreign keys to other tables T2 and T3, then T1 is defined to be a bridge table. In this case, a slot with cardinality multiple will be added to C2 (representing T2), whose values are instances of C3 (representing T3). A corresponding inverse slot will be added to C3 whose values are instances of C2. Each row in T1 will cause an instance of C3 to be added to the new slot of the corresponding C2 instance, and an instance of C2 to be added to the inverse slot of the corresponding C3 instance. (7VA)
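The bridge-table rule above can be sketched as follows. This is a minimal illustration under assumptions, not DataGenie's implementation: `BridgeTableSketch`, `TableMeta`, `isBridgeTable`, and `linkRows` are hypothetical names, the table metadata is supplied by hand rather than read from the database, and plain maps stand in for the multi-valued slot on C2 and its inverse slot on C3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: not DataGenie's source and not the Protege API.
final class BridgeTableSketch {

    /** Minimal table metadata: column names, primary-key columns, and column -> referenced table. */
    static final class TableMeta {
        final List<String> columns;
        final Set<String> primaryKey;
        final Map<String, String> foreignKeys;

        TableMeta(List<String> columns, Set<String> primaryKey, Map<String, String> foreignKeys) {
            this.columns = columns;
            this.primaryKey = primaryKey;
            this.foreignKeys = foreignKeys;
        }
    }

    /** Bridge-table test: exactly two columns, both in the primary key, both foreign keys. */
    static boolean isBridgeTable(TableMeta t) {
        return t.columns.size() == 2
            && t.primaryKey.containsAll(t.columns)
            && t.foreignKeys.keySet().containsAll(t.columns);
    }

    /**
     * For each bridge row (key into T2, key into T3), records the link in both directions.
     * The two maps stand in for the multi-valued slot on C2 and the inverse slot on C3.
     */
    static void linkRows(List<List<Object>> bridgeRows,
                         Map<Object, List<Object>> slotOnC2,
                         Map<Object, List<Object>> inverseSlotOnC3) {
        for (List<Object> row : bridgeRows) {
            Object keyIntoT2 = row.get(0);
            Object keyIntoT3 = row.get(1);
            slotOnC2.computeIfAbsent(keyIntoT2, k -> new ArrayList<>()).add(keyIntoT3);
            inverseSlotOnC3.computeIfAbsent(keyIntoT3, k -> new ArrayList<>()).add(keyIntoT2);
        }
    }

    public static void main(String[] args) {
        // Made-up example: enrollment(student_id, course_id) bridges student and course.
        Map<String, String> fks = new LinkedHashMap<>();
        fks.put("student_id", "student");
        fks.put("course_id", "course");
        TableMeta enrollment = new TableMeta(
            Arrays.asList("student_id", "course_id"),
            new HashSet<>(Arrays.asList("student_id", "course_id")),
            fks);
        System.out.println("bridge table? " + isBridgeTable(enrollment));

        Map<Object, List<Object>> coursesOfStudent = new LinkedHashMap<>();
        Map<Object, List<Object>> studentsOfCourse = new LinkedHashMap<>();
        linkRows(Arrays.asList(
                Arrays.<Object>asList(1, "CS101"),
                Arrays.<Object>asList(1, "CS102"),
                Arrays.<Object>asList(2, "CS101")),
            coursesOfStudent, studentsOfCourse);
        System.out.println(coursesOfStudent);
        System.out.println(studentsOfCourse);
    }
}
```

The sketch stores key values rather than instance references to stay short; the simpler foreign-key case from the previous bullet corresponds to filling only one direction, from C1 to C2.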
Microsoft Excel Spreadsheets: (8ZE)
The DataGenie tab will only import Excel spreadsheets if the spreadsheet contains tables. See the following thread in the mailing list archives for more detail: (9CB)
http://article.gmane.org/gmane.comp.misc.ontology.protege.general/7736. (9CC)
If your spreadsheet does not contain a table, you can convert it to a Microsoft Access database and then go through the import process. (8ZF)
Screenshots (7VG)
DataGenie connected via an ODBC data source: (7VH)
DataGenie connected via a JDBC data source: (7VJ)
Authors (7VL)
Authors: John Gennari, My Nguyen, Adam Silberfein (7VM)
Institutions: University of Washington, Stanford University (7VN)
Level of Support (7VO)
The original authors of this plug-in from the University of Washington no longer support DataGenie. (7VQ)
Since the Protege community continues to express mild interest in this plug-in's functionality, the Protege Team set up this Wiki page for reference and moved the source code for this plug-in to our Wiki. If you have a question about how to use the tab, please send email to the protege-discussion mailing list. (7VP)
For the sake of a complete historical record of the development of DataGenie, here is a link to the OUTDATED page at the University of Washington from when they were still maintaining the plug-in: http://faculty.washington.edu/gennari/Protege-plugins/DataGenie/index.html. (7VV)
License (7VR)
DataGenie and its source code (like Protege) are freely available under the open source Mozilla Public License. (7VS)
Source code for version 1.1 (ZIP file format): http://protege.cim3.net/file/work/files/DataGenie/version-1.1/datagenie-src-1.1.zip. (7VT)
Source code for version 2.0.1 (ZIP file format): http://protege.cim3.net/file/work/files/DataGenie/version-2.0.1/datagenie-src-2.0.1.zip. (7VU)