AngularJS and the hierarchy widget

This week I have started to learn about and use AngularJS.

AngularJS is an open-source JavaScript framework that brings MVC capabilities to browser applications. It is maintained by Google and is designed to make development and testing easier.

I will be using AngularJS in one of the final phases of the project to build a widget for a hierarchy-based concept search. Here are the thoughts on the widget:

The search widget will search through the concepts with a typical AJAX auto-complete. Selecting a concept will (1) set the value of the auto-complete to the selected concept, and (2) if the concept has a mapping to SNOMED CT, show the SNOMED hierarchy around that concept in the tree below. In the tree, each item is a SNOMED CT ConceptReferenceTerm. Some of these have OpenMRS Concepts mapped to them; if they do, we should visually highlight the row and also display the preferred name of that Concept. We will show one level of parents and children. Clicking on a parent or child redraws the tree for that term (i.e., the user can traverse up or down one step by clicking a row). Also, if the term you clicked on has one or more Concepts attached to it (query the API for concepts with a mapping SAME-AS SNOMED CT 123456), then visually highlight that row and show the OpenMRS Concepts using their preferred names. These concept names should actually be links/buttons that set the search box above (and its form field) to that concept. Implement it all as a fragment called "chooseConceptByHierarchy" that requires a "formFieldName" parameter, and ultimately sets a hidden input (with that form field name) to the conceptId of the chosen concept.
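For the "one or more Concepts attached to it" part, I expect the server-side lookup to be something along these lines. This is just a sketch: the source name "SNOMED CT" is an assumption, and getConceptsByMapping matches any map type, so the SAME-AS filtering would still need to be added.

import java.util.List;

import org.openmrs.Concept;
import org.openmrs.ConceptName;
import org.openmrs.api.context.Context;

public class SnomedMappingLookup {

    // Find OpenMRS Concepts mapped to the given SNOMED CT code
    public List<Concept> getConceptsForSnomedCode(String snomedCode) {
        return Context.getConceptService().getConceptsByMapping(snomedCode, "SNOMED CT");
    }

    // Preferred names to display next to a highlighted row in the tree
    public void printPreferredNames(String snomedCode) {
        for (Concept concept : getConceptsForSnomedCode(snomedCode)) {
            ConceptName preferred = concept.getPreferredName(Context.getLocale());
            System.out.println(preferred != null ? preferred.getName() : concept.getDisplayString());
        }
    }
}

The fragment itself would just call something like this and render the resulting names as the links/buttons described above.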

Once this is finished we should be done with pretty much all of the functionality for the project, and it will just be time to clean up and add the final touches.

Learning to Use Lucene

http://lucene.apache.org/

Lucene is an open-source Java full-text search library. It makes it easy to add search functionality to an application or website. After studying the best way to handle the work I started last week, I found that this is the best tool for going through the SNOMED CT files.

The files were designed to be loaded straight into a database. Because we do not plan on doing that, indexing seems to be the next best thing. Lucene gives me the ability to index the files so that I can easily search for the information I need for my reference terms, which cuts down the time it was taking to process the SNOMED files. For each line in a file I split it on the tab delimiters and tell Lucene what each column is, so that I can search on the data contained in that column. For instance, sct2_Relationship_Full_INT_20130131 has a sourceId and a destinationId that I need to search on, so I put each into a StringField:

doc.add(new StringField("sourceId", fileFields[4], Field.Store.YES));
doc.add(new StringField("destinationId", fileFields[5], Field.Store.YES));
This makes it so that I can tell it I want to find termId “1234” in the sourceId field.
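To give an idea of the whole round trip, here is a rough sketch of indexing the relationship file and then searching it, in the Lucene 4.x style matching the StringField calls above. The index directory, version constant, and analyzer are my own choices, and error handling is left out.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class SnomedRelationshipIndexer {

    public static void main(String[] args) throws Exception {
        FSDirectory dir = FSDirectory.open(new File("snomed-index"));

        // index: one Lucene Document per line of the relationship file
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new StandardAnalyzer(Version.LUCENE_43));
        IndexWriter writer = new IndexWriter(dir, config);
        BufferedReader in = new BufferedReader(new FileReader("sct2_Relationship_Full_INT_20130131.txt"));
        in.readLine(); // skip the RF2 header row
        String line;
        while ((line = in.readLine()) != null) {
            String[] fileFields = line.split("\t");
            Document doc = new Document();
            doc.add(new StringField("sourceId", fileFields[4], Field.Store.YES));
            doc.add(new StringField("destinationId", fileFields[5], Field.Store.YES));
            writer.addDocument(doc);
        }
        in.close();
        writer.close();

        // search: StringFields are indexed untokenized, so an exact TermQuery finds them
        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
        ScoreDoc[] hits = searcher.search(new TermQuery(new Term("sourceId", "1234")), 100).scoreDocs;
        for (ScoreDoc hit : hits) {
            Document found = searcher.doc(hit.doc);
            System.out.println(found.get("sourceId") + " -> " + found.get("destinationId"));
        }
    }
}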

What the structure looks like:
Index -> Document1, Document2, etc. -> Field1, Field2, etc.

Here is a great description of what it is and how to use it: http://www.darksleep.com/lucene/
Here is a tutorial with code: http://www.lucenetutorial.com/

Lucene was not hard to learn to use. But it is only an API: all of the hard parts of indexing are done for you, but it is up to the user to figure out how to parse the file data into Lucene and how to get the data back out, which depends on how you put the data into the Documents.

Lucene cut the search processing time by more than a factor of ten.

More SNOMED Files

This week I have been working on the following:

A dashboard-style page where you (1) indicate where on your filesystem the SNOMED files are, and (2) have a series of buttons for the different tasks, like "add names to all my SNOMED terms" and "import ancestors for all my SNOMED terms".

Each of those tasks should actually function as a task that runs in the background after you kick it off, and it shows you progress, along with a stop/cancel option. (“Progress” might be “what % of the rows of this file have I dealt with,” or “what % of my reference terms have I looked at”.)
Something like this:

[Mockup screenshot of the task dashboard]
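Here is one way I imagine a single task could be structured behind that page. This is just a sketch, not the final design; the class name and the rows-based progress measure are my own assumptions.

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class SnomedFileTask implements Runnable {

    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    private final AtomicInteger rowsProcessed = new AtomicInteger(0);
    private final int totalRows;

    public SnomedFileTask(int totalRows) {
        this.totalRows = totalRows;
    }

    public void run() {
        for (int row = 0; row < totalRows && !cancelled.get(); row++) {
            // ... process one row of the SNOMED file here ...
            rowsProcessed.incrementAndGet();
        }
    }

    // Called from the dashboard page to show "what % of the rows have I dealt with"
    public int getPercentComplete() {
        return totalRows == 0 ? 100 : (int) (100L * rowsProcessed.get() / totalRows);
    }

    // Called when the user clicks stop/cancel
    public void cancel() {
        cancelled.set(true);
    }
}

Kicking one off would then just be new Thread(task).start() (or an executor), with the dashboard polling getPercentComplete() and calling cancel() when the stop button is clicked.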

I have mostly been working on the ancestors and parents, which is what we need for the hierarchy. I have about 38,000 terms in the database and I am traversing a file with 4,000,000 records to find the ancestors. I think the best way to handle this will be to use a HashMap with byte offsets so that the process does not take too long. I have built the skeleton of the work and I am tweaking it to make it as fast as possible, but I do not see any issues with getting the work done.
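Here is a rough sketch of the byte-offset idea. The field positions and file handling are assumptions; in practice the first pass could use a buffered reader that counts bytes, but the idea is the same.

import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RelationshipOffsetIndex {

    private final Map<String, List<Long>> offsetsBySourceId = new HashMap<String, List<Long>>();
    private final RandomAccessFile file;

    // One full pass over the file, recording where each sourceId's rows start
    public RelationshipOffsetIndex(String path) throws Exception {
        file = new RandomAccessFile(path, "r");
        long offset = file.getFilePointer();
        String line;
        while ((line = file.readLine()) != null) {
            String[] fields = line.split("\t");
            if (fields.length > 5) {
                String sourceId = fields[4];
                List<Long> offsets = offsetsBySourceId.get(sourceId);
                if (offsets == null) {
                    offsets = new ArrayList<Long>();
                    offsetsBySourceId.put(sourceId, offsets);
                }
                offsets.add(offset);
            }
            offset = file.getFilePointer();
        }
    }

    // Re-read only the lines for one term instead of rescanning the whole file
    public List<String> getRowsFor(String sourceId) throws Exception {
        List<String> rows = new ArrayList<String>();
        List<Long> offsets = offsetsBySourceId.get(sourceId);
        if (offsets != null) {
            for (Long o : offsets) {
                file.seek(o);
                rows.add(file.readLine());
            }
        }
        return rows;
    }
}

That way I pay for one full scan of the 4,000,000-record file up front, and after that each ancestor lookup is just a handful of seeks.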

Midterm progress and SNOMED Subsets

This week has been focused on creating a presentation to share the progress so far and trying to determine the correct subset of data to use for the concept_reference_term table.

For the subset, it looks like we will start from existing subsets (CIEL, CMT, CORE), then determine a list of SNOMED codes based on the subset chosen, and just load those up. CIEL seems to include only mid-level terms, so to display a useful hierarchy for a concept we also need to include all the parents of a term. Then the hierarchy can be populated whether or not there is a CIEL concept currently in the database for it. This is where we are with the subsets now.

For the midterm progress, we are releasing stage one and we are 50% done with stage two. We have not begun stage three, but as you can see we are halfway done at the midpoint, so we are right on track.

About SNOMED

This week the focus has been on getting reference terms populated into the database by uploading a SNOMED file.

Here is some information about SNOMED (taken from Wikipedia):

SNOMED CT or SNOMED Clinical Terms[2] is a systematically organized computer processable collection of medical terms providing codes, terms, synonyms and definitions used in clinical documentation and reporting. SNOMED CT is considered to be the most comprehensive, multilingual clinical healthcare terminology in the world.[3] The primary purpose of SNOMED CT is to encode the meanings that are used in health information and to support the effective clinical recording of data with the aim of improving patient care. SNOMED CT provides the core general terminology for electronic health records. SNOMED CT comprehensive coverage includes: clinical findings, symptoms, diagnoses, procedures, body structures, organisms and other etiologies, substances, pharmaceuticals, devices and specimen.[3]

SNOMED CT provides for consistent information interchange and is fundamental to an interoperable electronic health record. It allows a consistent way to index, store, retrieve, and aggregate clinical data across specialties and sites of care. It also helps in organizing the content of electronic health records systems by reducing the variability in the way data is captured, encoded and used for clinical care of patients and research.[4] SNOMED CT can be used to record clinical details of individuals in the electronic patient records. It also provides the user with a number of linkages to clinical care pathways, shared care plans and other knowledge resources, in order to facilitate informed decision-making and support long term patient care. The availability of free automatic coding tools and services, which can return a ranked list of SNOMED CT descriptors to encode any clinical report, could help healthcare professionals to navigate the terminology.

SNOMED CT is a terminology that can cross-map to other international standards and classifications.[3] Specific language editions are available which augment the international edition and can contain language translations, as well as additional national terms. For example, SNOMED CT-AU, released in December 2009 in Australia, is based on the international version of SNOMED CT, but encompasses words and ideas that are clinically and technically unique to Australia.[5]

We will be using the information from the SNOMED CT RF2 release to populate our reference term table, in preparation for retrieving the hierarchy information later in the project.

jQuery Plugins

So this week I started out by working on some code to upload SNOMED data files and save the data in the system. I had started looking at creating a progress bar to show how long the upload would take. This had to take a back seat because my mentor was away on business and was not able to help with the confusion I had about how the files mapped to the tables in the database. So we changed direction, and instead I created a Reference Term Browser.

While working on the Reference Term Browser I used the DataTables jQuery plugin. Wow, this really saves a lot of time, so I would like to show how I incorporated it into my code with OpenMRS.

It is really quite easy. First, download all the files you will need for the plugin. They are usually CSS and JS files. I needed jquery.dataTables.min.js and fourButtonPagination.js.

Then import them in your GSP with:

ui.includeJavascript("yourModuleName", "jquery.dataTables.min.js");
ui.includeJavascript("yourModuleName", "fourButtonPagination.js");

Create a fragment that makes a getJSON call for the DataTable data:
<script type="text/javascript">
    jq.getJSON('${ ui.actionLink("yourModuleName", "browseTableOfReferenceTerms", "getPage") }')
        .success(function(data) {
            jQuery('#demo').html('<table cellpadding="0" cellspacing="0" border="0" id="example"></table>');
            jQuery('#example').dataTable({
                "sPaginationType": "four_button",
                "aaData": data,
                "aoColumns": [
                    { "sTitle": "source" },
                    { "sTitle": "code" },
                    { "sTitle": "name" },
                    { "sTitle": "description" }
                ]
            });
        })
        .error(function(xhr, status, err) {
            alert('Reference Term AJAX error: ' + err);
        });
</script>

Then include that fragment with:

${ ui.includeFragment("yourModuleName", "yourFragmentName") }

and I needed a div for displaying my table:
<div id="demo">

</div>

You will also need a controller to get the data when you make the json call. Here are the contents of mine:

package org.openmrs.module.conceptmanagementapps.fragment.controller;

import java.util.ArrayList;
import java.util.List;

import org.openmrs.ConceptReferenceTerm;
import org.openmrs.api.context.Context;
import org.openmrs.module.appui.UiSessionContext;
import org.openmrs.module.conceptmanagementapps.api.ConceptManagementAppsService;
import org.openmrs.ui.framework.page.PageModel;

public class BrowseTableOfReferenceTermsFragmentController {

    // Called by the getJSON request above; returns one String[] per table row
    public List<String[]> getPage() throws Exception {
        ConceptManagementAppsService conceptManagementAppsService = (ConceptManagementAppsService) Context
                .getService(ConceptManagementAppsService.class);
        List<ConceptReferenceTerm> referenceTermList = conceptManagementAppsService.getReferenceTermsForAllSources(0, 200);
        List<String[]> referenceTermDataList = new ArrayList<String[]>();
        for (ConceptReferenceTerm crt : referenceTermList) {
            String[] referenceTermArray = { crt.getConceptSource().getName(), crt.getCode(), crt.getName(),
                    crt.getDescription() };
            referenceTermDataList.add(referenceTermArray);
        }
        return referenceTermDataList;
    }

    public void get(UiSessionContext sessionContext, PageModel model) throws Exception {
    }
}

That is it. It is very straightforward and really saves time, because it gives you a well-organized table that you can scroll through, search, and filter with very little code.

Here is a link to the plugin I used if you would like to see it in action: http://datatables.net/index

 

CSV files

This week I have been focusing on cleaning up my code and changing my upload and download to use Super CSV.

Super CSV is a very good tool for reading and writing CSV files. CSV files are simply files that contain multiple rows, with each field delimited by a comma. Some of the difficult parts of working with CSV files are:

  • they may or may not contain headers
  • each field may or may not be enclosed in double quotes
  • within the header and each record, there may be one or more fields, separated by comma
  • the last record in the file may or may not have an ending line break

Super CSV is a library created to help work around these difficulties in CSV files.

It has four different readers to work with: CsvBeanReader, CsvDozerBeanReader, CsvListReader, and CsvMapReader.

For my needs I chose the CsvMapReader. It is very useful. First you have to instantiate a reader and then set up your Cell Processors. The cell processors are what help Super CSV effectively parse your file. The cells are then mapped to the header names and put in a Map<String, Object>. Here is an example of the cell processors from the Super CSV site:

private static CellProcessor[] getProcessors() {

        final String emailRegex = "[a-z0-9\\._]+@[a-z0-9\\.]+"; // just an example, not very robust!
        StrRegEx.registerMessage(emailRegex, "must be a valid email address");

        final CellProcessor[] processors = new CellProcessor[] { 
                new UniqueHashCode(), // customerNo (must be unique)
                new NotNull(), // firstName
                new NotNull(), // lastName
                new ParseDate("dd/MM/yyyy"), // birthDate
                new NotNull(), // mailingAddress
                new Optional(new ParseBool()), // married
                new Optional(new ParseInt()), // numberOfKids
                new NotNull(), // favouriteQuote
                new StrRegEx(emailRegex), // email
                new LMinMax(0L, LMinMax.MAX_LONG) // loyaltyPoints
        };

        return processors;
}

Here is how you get the cell processor: final CellProcessor[] processors = getProcessors();

and the headers: final String[] header = mapReader.getHeader(true);

Then, to read each row and get at the values of its fields, you loop like this:

Map<String, Object> exampleMap;
while ((exampleMap = mapReader.read(header, processors)) != null) {
    // work with the current row (exampleMap) here
}

and simply call exampleMap.get("column name which matches the header"). It really saves time: there are many ways to parse a CSV file, but this library has taken the best of them and put them into a useful package.
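To tie the pieces together, here is a minimal self-contained sketch of reading a file with CsvMapReader. The file name and the three column names (and their processors) are placeholders, not the real ones from the project.

import java.io.FileReader;
import java.util.Map;

import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvMapReader;
import org.supercsv.io.ICsvMapReader;
import org.supercsv.prefs.CsvPreference;

public class ReadWithCsvMapReader {

    public static void main(String[] args) throws Exception {
        ICsvMapReader mapReader = null;
        try {
            mapReader = new CsvMapReader(new FileReader("terms.csv"), CsvPreference.STANDARD_PREFERENCE);

            // the header row supplies the keys used in each row's map
            final String[] header = mapReader.getHeader(true);
            final CellProcessor[] processors = new CellProcessor[] {
                    new NotNull(),  // code
                    new NotNull(),  // name
                    new Optional()  // description
            };

            Map<String, Object> row;
            while ((row = mapReader.read(header, processors)) != null) {
                System.out.println(row.get("code") + " - " + row.get("name"));
            }
        } finally {
            if (mapReader != null) {
                mapReader.close();
            }
        }
    }
}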

The rest of the week has been spent trying to make fields easy to understand and making it look better.