Lexical Tools

Using Other Database - MySql

This page provides a general guideline of how to integrate and configure lexical tools with other database, such as MySql. An open source database, MySql, is illustrated as an example. Users may skip this page if they use the installation script to install their Lexical tools with the default database (HSqlDb). We recommend using the default embedded HSqlDb unless using other database is a must in your project. The option of MySql installation scripts is not provided in the Lexical Tools distribution after 2012 release.

  1. Install Lexical Tools
  2. Database Setup
  3. Load data
  4. Configure in Lexical Tools
  5. Issues


1. Install Lexical Tools

Follow the instructions on Lexical Tools Installation Instructions to complete the installation of lexical tools.

Installation

2. Database Setup

2.1 Requirements

A running database with following specifications is needed to run lexical tools with this option. Please consult to your database administrator if you have questions.

  • A existing database with enough space
  • Root account and password of database
  • Hostname and port number of database
  • JDBC driver for the database

2.2 Procedures

  • Create a user account for lvg
  • Create a database for lvg2024

Consult with your DBA if you have any questions

2.3 Example: MySql

MySql is one of the most popular open source databases. We use it as an example for this option. Bellows are required information:

  • DB: MySql V.4.1.12
    Please refer to the MySql web site to download software and follow the documents to install MySql
  • JDBC: Download mysql-connector-java-3.1.10-bin.jar from web and put it to ${LVG_DIR}/lib/jdbcDrivers/MySql/
  • root account/password: root/root_password
  • Hostname/port number: localhost/3306(default)

Also create the user account and database for lvg:

Create a user account:

  • username: lvg
  • password: lvg

Create a database:

  • database name: lvg2024

2.4 Sample Script:

You might need your DBA to help you on setting up user account and database for lvg. The program and shell scripts of HSqlDb provided with lvg package can be used as reference for this step. However, you need to modify the program and script and create your own. Once you created and modified your programs in ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql, you need to

  • Edit file ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql/db.cfg and modify values of
    • DBA_USERNAME
    • DBA_PASSWORD
    • LVG_DIR
  • Edit file ${LVG_DIR}/loadDb/bin/1.SetupMySql and modify values of
    • JDBC
  • shell>cd ${LVG_DIR}/loadDb/bin/
  • shell>1.SetupMySql <year>

Also, the programs and scripts for MySql was provided in Lvg.2012- and can be used as another reference.

Up to this point, users should have:

  • installed the MySql database,
  • created a database in it called lvg2024
  • opened an account with lvg as both username and password

3. Load Data

3.1 DB tables and data files

There are 9 major tables to be created and loaded into database. Each of the tables below needs to be created and loaded. The directory for these data files is: ${LVG_DIR}/data/tables/*

Table NameSource Data FileTotal RecordsCreate and Load TableCheck data
Inflection${LVG_DIR}/data/tables/inflection.data1,918,987LoadInflectionCheckInflection
Synonym${LVG_DIR}/data/tables/synonyms.data281,574LoadSynonymCheckSynonym
Acronym${LVG_DIR}/data/tables/acronym.data296,855LoadAcronymCheckAcronym
ProperNoun${LVG_DIR}/data/tables/properNoun.data18,718LoadProperNounCheckProperNoun
Derivation${LVG_DIR}/data/tables/derivation.data157,387LoadDerivationCheckDerivation
Antonyms${LVG_DIR}/data/tables/antonyms.data9,070LoadAntonymCheckAntonym
Nominalization${LVG_DIR}/data/tables/nominalization.data35,682LoadNominalizationCheckNominalization
Canonical${LVG_DIR}/data/tables/canonical.data1,897,973LoadCanonCheckCanon
Fruitful${LVG_DIR}/data/tables/fruitful.data6,401,636LoadFruitfulCheckFruitful
AntiNorm${LVG_DIR}/data/tables/antiNorm.data2,466,362LoadAntiNormCheckAntiNorm

Please notes that the last two columns are names of Java programs used to create DB tables, load data into database, and check loaded table. Users may find source codes for loading and checking data in HSqlDb are under ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/HSqlDb. These Java source codes can be used for HSqlDb (or other database) with minor modifications.
Both ${LVG_DIR}/loadDb/lib/lvgLoadDb.jar and ${LVG_DIR}/lib/lvg${YEAR}.jar contain all Java classes for these files.

3.2 Procedures

  • Download JDBC driver and put it to ${LVG_DIR}/lib/jdbcDrivers/NameOfDatabase

  • Create tables and load data:
    As mentioned above, users need to create 9 Db tables and load data into these tables. Please refer to:
    • Lexical Tools database pages for details on all DB tables
    • ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql for Java codes of creating Db tables and loading data

  • Verify database (optional):
    It is a good idea for users to verify database after data are loaded into these DB tables.

3.3 Example: MySql

This section describes procedures of loading data into 9 DB tables in MySql database as follows:

  • Download MySql JDBC driver: mysql-connector-java-3.1.10-bin.jar and put it to ${LVG_DIR}/lib/jdbcDrivers/MySql
  • Run scripts, LoadXxx, to
    • Delete table Xxx if the table already existed in the database
    • Create a new table named Xxx in the database
    • Load data from a flat file (Xxx.data) into table Xxx
    • Checks the total number of inserted records
  • Run scripts, CheckXxx, to
    • Verify database tables (optional)

This paragraph describes steps to run java programs of LoadXxx and CheckXxx.

  • Set all variables in the database config file (${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql/db.cfg).
  • Add ${LVG_DIR}/loadDb/lib/lvgLoadDb.jar to classpath.
  • Add ${LVG_DIR}/lib/jdbcDrivers/MySql/mysql-connector-java-3.1.10-bin.jar to classpath.
  • cd ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql/
  • java -classpath $CLASSPATH gov.nih.nlm.nls.lvg.loadDb.MySql.LoadXXX

Users may run CheckXxx (similar setup steps as above) to check and perform queries to table Xxx to make sure all tables are created and loaded correctly.

3.4 Sample Scripts

There are two scripts are provided to ease the processes of loading and checking data in the database:

  • Edit file ${LVG_DIR}/loadDb/bin/2.LoadDb and modify values of
    • LVG_DIR
    • JDBC
  • shell>cd ${LVG_DIR}/loadDb/bin/
  • shell> 2.LoadDb <year>
    2
    3
    

  • Edit file ${LVG_DIR}/loadDb/bin/3.TestDb and modify values of
    • LVG_DIR
    • JDBC
  • shell>cd ${LVG_DIR}/loadDb/bin/
  • shell> 3.TestDb
    
    

4. Configure Lexical Tools

4.1 Configuration file

The default configuration file is at ${LVG_DIR}/data/config/lvg.properties. You will need to modify the configuration file to use

  • different database
  • different database port number
  • different JDBC connector
Please refer to Configuration setup page for details.

4.2 JDBC Driver

Add your JDBC driver to CLASSPATH

4.3 Example: MySql

Use MySql as example, there are two way to set up in the configuration, as described bellows:

  1. Method 1:
    • DB_TYPE=MYSQL
    • DB_DRIVER=com.mysql.jdbc.Driver
    • DB_NAME=lvg2024
    • DB_HOST=localhost
    • DB_USERNAME=lvg
    • DB_PASSWORD=lvg

    or

  2. Method 2:
    • DB_TYPE=OTHER
    • DB_DRIVER=com.mysql.jdbc.Driver
    • JDBC_URL=jdbc:mysql://localhost/lvg2024
    • DB_USERNAME=lvg
    • DB_PASSWORD=lvg

In addition, users need to add JDBC driver to classpath. For examples, add ${LVG_DIR}/lib/jdbcDrivers/MySql/mysql-connector-java-3.1.8-bin.jar to ${LVG_DIR}/bin/norm if you want to run norm.

Now, Lexical Tools are ready to run with MySql database

4.4 JDBC & ports:

  • Use different port number in MySql:
    The default port of MySql Db is 3306. In case that different port number (for example, 3307) is used in your MySql database, you will need to add the port number after the database host (localhost:3307).

  • Use different JDBC connector:
    Users may use any workable JDBC driver for their database. Bellows are the quick summary of using different JDBC connector.
    • Download JDBC jar file (mysql-connector-java-3.1.10-bin.jar)
    • Add the JDBC jar file in the CLASSPATH (${LVG_DIR}/lib/jdbcDrivers/MySql/mysql-connector-java-3.1.10-bin.jar)
    • Change the Driver class name (com.mysql.jdbc.Driver) in the lvg configuration file.

5. Issues

  • MySql & UTF-8
    In our test, MySql 4.1 along with MySql JDBC 3.1.8 are not 100% compatible with UTF-8. Some problems caused by prepareStatement and table creation. For the same reason, we expect similar problems might happen in other database server. Accordingly, we suggest users use Lexical tools with its default DB, HyperSonic DB.

  • MySql in PC/Win platform
    In past experience, it took more than 20 hours to load lvg data into MySql DB in PC-win platform (4.1). It only takes less than 30 min. to load lvg data into MySql DB in Linux and Unix. However, the performance on lvg operation speed is at the same level of magnitude.