The following documentation is broken into three sections. The first section is a quick functional introduction to the current Name program. It is organized as a `cookbook' which describes how to perform common functions with Name. The second section is a more detailed description of how the current program works. The final section discusses future plans including known problems, suggestions for how others can help, and the current design document.
Enjoy!
Nathan Wilson
nathan at collectivesource dot com
Once a genus is selected, a list of all the species in that genus will appear in the species panel. If a set of genera is selected, then you will need to click on the species panel (or hit the <tab> key) to see the list of all the species in all the selected genera. Note that once the focus switches from the list of genera to the list of species you will no longer see the highlighted items in the list of genera.
Once the species panel is selected, you can use the scrolling list or the edit box to select one or more species in the same way you selected the genera. Once a species or set of species is selected you can add them to all active List Windows by either selecting the Add Selection button or by pressing the <return> or <enter> key. You may need to move the Search Window in order to see the contents of the List Window. The program will warn you if more than one species is being added and give you a chance to review the exact list that will be added. It also warns you if the database considers any of the selected names to be invalid and suggests appropriate accepted names to use instead.
After a taxon has been added to the List Window you can repeat the selection process to add additional species. You can select the genus panel by either clicking on it or pressing the <tab> key while holding down the <shift> key. Once you select the genus panel, any selection in the species panel is cleared.
Taxa not in the database can be added to a species list by first entering the genus and species in the appropriate edit boxes of the respective panels, and then either clicking on the Add Text button or pressing the <return> or <enter> key while holding down the <shift> key.
++
Levels Kingdom Phylum Order Family Genus
Phylum Ascomycota Order Eurotiales Family Trichocomaceae Genus Penicillium
Family Trichocomaceae Genus Eupenicillium
Family Trichocomaceae Genus Talaromyces
If the above file is added using File->Add... then these three genera will be added to the database. If the above file is simply opened using File->Open..., then a new database will be created with just the three genera, and one family, order, and phylum. If this file is added to the Full Fungi database that comes with this release of Name, then these genera will automatically be part of the kingdom Fungi since the phylum Ascomycota is already in the database.
Full details on this file format are given in the section on the ASCII Extension Format, but there are a few particularly important things to watch out for.
Genus Penicillium = Genus Eupenicillium & Genus Talaromyces
makes the genus Penicillium the accepted equivalent of Eupenicillium and Talaromyces. It does not, however, change the names of any taxa that might be below Eupenicillium or Talaromyces.
Common\ Name Green\ Bread\ Mold = Genus Penicillium
associates the name “Green Bread Mold” with the genus Penicillium. Note that spaces that are supposed to be part of a name must be preceded by a backslash (\). In order to get this example to load as part of a new database, it would also be necessary to add the line,
Categories Common\ name
The categories line should appear either directly above or directly below the Levels line.
The most common use for the search window is to select the genus/species combinations needed for creating species lists. Specific details on how to do this particular task are given in the earlier Cookbook section. However, the Search Window can also be used for more interesting explorations of names.
The search window is made up of a set of Selection Panels. Selection Panels are used to successively reduce the set of names that you are working with. A Selection Panel includes a popup menu which contains a list of `property values' (taxonomic levels and categories), an `Only Accepted' checkbox, an edit box and a list of taxa. 'Common name' is the standard example of a category. Like taxonomic levels, categories describe a set of names. The names that categories describe are referred to as groups. Groups are arbitrary sets of taxa that define the name. Another example of a category might be 'French common names'. An example of a group is 'Pine spike', which contains Chroogomphus vinicolor and Chroogomphus rutilus.
The names listed in a Selection Panel are either at the given level or in the given category. The Selection Panel on the far left will list all the names in the database at the selected level or in the selected category. The names in the next Selection Panel to the right will not only be in the selected level or category, but will also either contain or be contained in the names or groups selected in the previous panel.
At most one of the Selection Panels is selected, as indicated by a black border. The selected panel can be changed by clicking anywhere within the panel you want selected. The selected panel indicates which names are actually selected. The panels to the left of the selected panel determine which taxa are listed in the selected panel. Panels to the right of the selected panel only list names if the panel to the immediate left has a selection. If a panel lists only a single name, then that name is automatically selected.
As a simple example, in the default configuration selecting the genus Amanita in the first panel will list the species of Amanita in the second panel. If more than one genus is selected then all the species in all the selected genera are listed. Duplicates are listed only once.
By default search windows have only two Selection Panels. However, additional panels can be created using the Panels->Add Panel menu option. The new panel will be added on the far right of the existing panels. You will need to either scroll the panels or grow the Search Window to see more than two panels. The Panels->Remove Panel menu option deletes the currently selected panel. If there are multiple Selection Panels to the left of a given Selection Panel, then the effects are cumulative. For example, if the first Selection Panel has the Class Agaricales selected, the second panel has the species smithii selected, and a third panel is set to the genus level, then the genera Agaricus, Conocybe and Volvariella are listed. If the Class Boletales were also selected in the first panel, then the third panel would also show the genera Boletus, Gomphidius and Rhizopogon.
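The cumulative filtering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy hierarchy, the `(level, name)` tuples and the function names are all hypothetical and are not taken from the Name source code. A name is listed in a panel if it is at the panel's level and is an ancestor or descendant of at least one selected name in every panel to its left.

```python
# Toy hierarchy (hypothetical data): child -> set of parents.
# As in the text, the species "smithii" is a single name with several parent genera.
PARENTS = {
    ("Genus", "Agaricus"): {("Class", "Agaricales")},
    ("Genus", "Conocybe"): {("Class", "Agaricales")},
    ("Genus", "Volvariella"): {("Class", "Agaricales")},
    ("Genus", "Amanita"): {("Class", "Agaricales")},
    ("Genus", "Boletus"): {("Class", "Boletales")},
    ("Genus", "Gomphidius"): {("Class", "Boletales")},
    ("Genus", "Rhizopogon"): {("Class", "Boletales")},
    ("Species", "smithii"): {
        ("Genus", "Agaricus"), ("Genus", "Conocybe"), ("Genus", "Volvariella"),
        ("Genus", "Boletus"), ("Genus", "Gomphidius"), ("Genus", "Rhizopogon"),
    },
    ("Species", "edulis"): {("Genus", "Boletus")},
}


def ancestors(node):
    """Return every name reachable by following parent links upward."""
    seen, stack = set(), [node]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def related(a, b):
    """True if one name contains, or is contained in, the other."""
    return a in ancestors(b) or b in ancestors(a)


def all_nodes():
    nodes = set(PARENTS)
    for parents in PARENTS.values():
        nodes |= parents
    return nodes


def panel_names(level, selections_to_the_left):
    """List the names a panel at `level` would show.

    `selections_to_the_left` holds one set of selected names per earlier panel.
    """
    return sorted(
        name
        for (node_level, name) in all_nodes()
        if node_level == level
        and all(
            any(related((node_level, name), chosen) for chosen in selection)
            for selection in selections_to_the_left
        )
    )


print(panel_names("Genus", [{("Class", "Agaricales")}, {("Species", "smithii")}]))
```

With this toy data the call lists Agaricus, Conocybe and Volvariella. Amanita is left out because it does not contain smithii, and adding Boletales to the first selection brings in the three Boletales genera as well.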
The Search Window provides extensive keyboard support. Typed text goes into the edit box at the top of the currently selected panel. Changing the text in the edit box causes all taxa that start with the text to be selected. In addition, the up and down arrows clear the edit box and make the selection the item above or below the current selection. Tab and shift-tab move to the following or preceding Selection Panels respectively. Selecting an item with the mouse also clears the edit box. The shift key and option keys work as expected for multiple selection with the mouse.
Finally, the <enter> or <return> keys add the names selected in the current Selection Panel to all unlocked List Windows. This operation can also be performed by pressing the Add Selection button. Names not in the current database can be added to the List Windows by typing them into the appropriate edit boxes and hitting <enter> or <return> while holding down the shift key or by pressing the Add Text button.
List Windows display lists of collections. In this version of Name, collections are simply a name and an associated date. In the future collections will contain a configurable set of key/value pairs of relevant data. E.g. collector's name, location, habitat, notes etc. A List Window lists each Name once. The triangular icon to the left of a name controls whether the individual collections for that name are listed. Collections or Names can be deleted by selecting the desired item and pressing the delete key.
A List Window can be locked and unlocked by clicking on the Locked check box. When a List Window is locked then new selections from a Search Window do not get added to that List Window.
The contents of a List Window can be saved or loaded using the standard File->Save and File->Open menu items. In addition, a text species list can be created by selecting the List->Make Species List... menu item. The resulting species list will have each name listed once. If there is more than one collection of the species, then the number of collections is given in parentheses after the name. Species lists created in this way can also be added to an existing List Window using the List->Add Species List... menu item.
The Multi-Selection Window allows the user to see and select from a set of selected species. By default all the names are selected. Names can be removed using the standard interface convention of clicking while holding down the command key. The up and down arrows can also be used to select the top or bottom member of the entire selection or to move the selection up by one if only a single item is selected.
The Rename Window allows the user to know when a name is no longer valid in the database. They are also given the chance to use the invalid name if they so choose. The names on the left are those selected from the Search Window. The names on the right are the accepted names according to the database. By default all the accepted names are selected for addition. Clicking on an unselected name unselects the currently selected version and selects the unselected name. The Select All buttons select either all the original names or all the accepted names as appropriate. The Add Selected button adds the selected names along with previously selected names that were already valid to the appropriate List Windows.
This menu lists all currently loaded databases by their filename.
If a database doesn't have a filename then it is listed as DB <id-number>.
The default database is indicated with a check mark. The default
database can be changed by selecting the appropriate menu item.
Collection List Format consists of a list of collections. Each collection is enclosed in square brackets, [ ], and consists of a list of key-value pairs. Each key and value is a string, and the two are separated by the '|' character. A key-value pair is terminated by a carriage return or linefeed. The '\' character is considered an escape character, meaning that a '|', square bracket, carriage return, linefeed or '\' can be included in either the key or value by preceding it with a '\'. A '\' preceding any other character is ignored.
Example output from Name:
[Name|Armillaria sp.
Date|Fri Oct 22 22:56:50 1999
Time|3149621810
]
[Name|Armillaria mellea
Date|Fri Oct 22 22:56:58 1999
Time|3149621818
]
[Name|Floccularia albolanaripes
Date|Fri Oct 22 22:56:56 1999
Time|3149621816
]
[Name|Floccularia albolanaripes
Date|Fri Oct 22 22:56:57 1999
Time|3149621817
]
[Name|Floccularia straminea
Date|Fri Oct 22 22:57:00 1999
Time|3149621820
]
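For readers who want to process Collection List files outside of Name, the rules above can be sketched as a small Python reader. This is a minimal sketch and not the parser used by Name. The function name is hypothetical, and two behaviours are assumptions: a '\' simply makes the next character literal, and a '|' that appears after the first one on a line is kept as part of the value.

```python
def parse_collections(text):
    """Parse Collection List Format text into a list of dicts (one per collection)."""
    collections = []
    current = None   # dict being filled, or None while outside [ ]
    key = None       # key of the pair being read, once its '|' has been seen
    buf = []         # characters of the key or value currently being read

    def finish_pair():
        nonlocal key, buf
        if key is not None:
            current[key] = "".join(buf)
        key, buf = None, []

    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Escape: take the next character literally (assumed behaviour).
            buf.append(next(chars, ""))
        elif current is None:
            if ch == "[":
                current, key, buf = {}, None, []
        elif ch == "|" and key is None:
            key, buf = "".join(buf), []
        elif ch in "\r\n":
            finish_pair()
        elif ch == "]":
            finish_pair()
            collections.append(current)
            current = None
        else:
            buf.append(ch)
    return collections


SAMPLE = (
    "[Name|Armillaria sp.\n"
    "Date|Fri Oct 22 22:56:50 1999\n"
    "Time|3149621810\n"
    "]\n"
    "[Name|Armillaria mellea\n"
    "Date|Fri Oct 22 22:56:58 1999\n"
    "Time|3149621818\n"
    "]\n"
)

for collection in parse_collections(SAMPLE):
    print(collection["Name"], collection["Time"])
```

Each collection becomes a plain dictionary, so the Name, Date and Time entries shown in the example output can be looked up by key.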
Species List Format consists of lines separated by carriage returns or linefeeds. When this type of file is read, the portion of each line up to the first parenthesis, `(', or the second set of spaces is considered to be the name. If a number occurs after the first parenthesis it is interpreted to be the number of collections that should be created.
Example output from Name:
Armillaria mellea
Armillaria ostoyae
Floccularia albolanaripes (2)
Floccularia straminea
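The reading rule for Species List Format can be illustrated with a short Python sketch. The function names are hypothetical, and the default of one collection for a line without a number is an assumption, since the text only describes what happens when a number is present.

```python
import re


def parse_species_line(line):
    """Return (name, collection_count) for one line of a species list."""
    head, paren, tail = line.partition("(")
    # The name ends at the first '(' or at the second run of spaces,
    # so at most the first two words before the parenthesis are kept.
    name = " ".join(head.split()[:2])
    match = re.match(r"\s*(\d+)", tail) if paren else None
    count = int(match.group(1)) if match else 1  # assumed default of 1
    return name, count


def parse_species_list(text):
    """Parse a whole file, skipping blank lines."""
    return [parse_species_line(line) for line in text.splitlines() if line.strip()]


SAMPLE = (
    "Armillaria mellea\n"
    "Armillaria ostoyae\n"
    "Floccularia albolanaripes (2)\n"
    "Floccularia straminea\n"
)

for name, count in parse_species_list(SAMPLE):
    print(name, count)
```

`str.splitlines` accepts carriage returns, linefeeds, or both, which matches the line separators the format allows.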
ASCII Name Database Format requires all of the following sections in order:
<the number of levels>
<that many level names>
<the number of name nodes>
<that many names and level indices>
<the number of equivalence nodes>
<the number of parent/child links>
<that many parent indices and child indices>
<the number of name node to equivalence node links>
<that many name node and equivalence node indices>
<the number of accepted parent links>
<that many name node and parent indices>
<the number of accepted child links>
<that many name node and child indices>
<the number of accepted equivalent links>
<that many name node and equivalence node indices>
<the number of accepted value links>
<that many equivalence node and name node indices>
<genus index>
<species index>
The format may optionally include all of the following sections in order:
<the number of categories>
<that many category names>
<the number of group nodes>
<that many names and category indices>
<the number of group/name membership links>
<that many group indices and member name indices>
<the number of accepted group/name membership links>
<that many group indices and member name indices>
<the number of preferred name/group links>
<that many name indices and preferred group indices>
Sections are separated by some type of whitespace. The level names are separated by whitespace. No escape character is defined for level names. Names are enclosed in double-quotes. The '\' character can be used as an escape character within names. Whitespace before or after the double-quotes is ignored. Numbers that are next to each other are separated by whitespace. The meaning of the different terms in the various sections is explained in the design document at the end of this document.
Example data file (note this example does not include the optional section
for groups):
2
Species Genus
4
"Xerocomus" 1 "Boletus" 1 "chrysenteron" 0 "edulis" 0
1
3
0 2 1 2 1 3
2
0 0 1 0
2
2 0 3 1
2
0 2 1 3
0
0
1 0
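To make the section order concrete, here is a hedged Python sketch that reads the required sections of the example file above. It is not the loader used by Name. The function names and dictionary keys are hypothetical, the optional group sections are ignored, and a '\' inside a quoted name is assumed to make the next character literal.

```python
def tokenize_database(text):
    """Split the text into unquoted words/numbers and double-quoted names."""
    tokens, i, n = [], 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == '"':
            i += 1
            chars = []
            while i < n and text[i] != '"':
                if text[i] == "\\" and i + 1 < n:
                    i += 1  # drop the backslash, keep the escaped character
                chars.append(text[i])
                i += 1
            i += 1  # skip the closing quote
            tokens.append("".join(chars))
        else:
            start = i
            while i < n and not text[i].isspace() and text[i] != '"':
                i += 1
            tokens.append(text[start:i])
    return tokens


def parse_name_database(text):
    """Read the required sections, in order, into a dictionary."""
    tokens = iter(tokenize_database(text))

    def word():
        return next(tokens)

    def number():
        return int(next(tokens))

    def pairs():
        count = number()
        return [(number(), number()) for _ in range(count)]

    db = {}
    level_count = number()
    db["levels"] = [word() for _ in range(level_count)]
    name_count = number()
    db["names"] = [(word(), number()) for _ in range(name_count)]
    db["equivalence_count"] = number()
    db["parent_child"] = pairs()
    db["name_equivalence"] = pairs()
    db["accepted_parent"] = pairs()
    db["accepted_child"] = pairs()
    db["accepted_equivalent"] = pairs()
    db["accepted_value"] = pairs()
    db["genus_index"] = number()
    db["species_index"] = number()
    return db


EXAMPLE = '''2
Species Genus
4
"Xerocomus" 1 "Boletus" 1 "chrysenteron" 0 "edulis" 0
1
3
0 2 1 2 1 3
2
0 0 1 0
2
2 0 3 1
2
0 2 1 3
0
0
1 0
'''

db = parse_name_database(EXAMPLE)
print(db["levels"], db["names"])
```

Read this way, the example declares two genera and two species, with chrysenteron linked to both Xerocomus and Boletus but accepting Xerocomus (name node 0) as its parent.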
Binary Name 0.7 Database Format provides an extremely efficient method for saving and loading the database. It is, in essence, the format in which the database is represented in memory. All numbers are four-byte, big-endian, binary integers. In general the value of -1 is used to indicate empty or null values. The format consists of the following sections:
<file format indicator>
<string block size><string block>
<name block size><name block>
<group block size><group block>
<eq block size><eq block>
<level block size><level block>
<category block size><category block>
<free list index>
<list block size><list block>
<genus index><species index>
The <file format indicator> is the two bytes 248 and 236. Each of the sizes is a number indicating the byte count of the following block. The <string block> is a sequence of null-terminated character strings.
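The outer layout of the binary format can be walked with Python's `struct` module. The sketch below only splits a file into its raw blocks and decodes the string block; it does not interpret the name, group, eq, level, category or list blocks. The function names are hypothetical, numbers are assumed to be signed (since -1 marks null values), and the Latin-1 text encoding is an assumption that may not match what Name actually writes.

```python
import struct

MAGIC = bytes([248, 236])  # the <file format indicator>
BLOCK_NAMES = ["string", "name", "group", "eq", "level", "category"]


def read_number(data, pos):
    """Read one four-byte, big-endian, signed integer."""
    (value,) = struct.unpack_from(">i", data, pos)
    return value, pos + 4


def read_block(data, pos):
    """Read a size-prefixed block and return its raw bytes."""
    size, pos = read_number(data, pos)
    return data[pos:pos + size], pos + size


def read_binary_database(data):
    if data[:2] != MAGIC:
        raise ValueError("missing Binary Name 0.7 file format indicator")
    pos = 2
    db = {}
    for name in BLOCK_NAMES:
        db[name], pos = read_block(data, pos)
    db["free_list_index"], pos = read_number(data, pos)
    db["list"], pos = read_block(data, pos)
    db["genus_index"], pos = read_number(data, pos)
    db["species_index"], pos = read_number(data, pos)
    # The string block is a run of null-terminated strings.
    # Decoding as Latin-1 is an assumption, not something the format states.
    db["strings"] = [s.decode("latin-1") for s in db["string"].split(b"\0")[:-1]]
    return db
```

A caller would pass the full file contents, for example `read_binary_database(open(path, "rb").read())`, where `path` is whatever database file is being inspected.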
The <name block> is a sequence of blocks of eleven numbers which represent the name nodes. The eleven numbers are:
The <list block> is a sequence of blocks of two numbers. The two numbers are:
Binary Name 0.6 Database Format translates the binary database
format used by the previous version of Name to the Binary Name 0.7 Database
Format.
ASCII Extension Format provides an easy way for users to add
to and adjust the database to better suit their needs. The format
consists of a series of distinct lines that are each interpreted in turn.
There are six possible types of lines: the Identifier line, the Levels line,
the Categories line, a Name Statement, a Strong Equivalence and a Weak
Equivalence. Below is a description of the syntax for each of these
along with an example and a brief, more intuitive description of what the
example does. After these descriptions are some more precise explanations
of how each line modifies the database. The precise explanations
require a better understanding of the database which is largely explained
in the Technical Design section. Exact details on the database
are given in the Developer documentation that comes with the database source.
Full examples of files using this format are included with the software
in files whose names end with .ext. It is not necessary
for files to be named this way for them to be recognized as ASCII Extension
Format.
Identifier line
Must be the first line in the file and consists of two plus signs,
++.
This line allows the program to identify the type of file when you load
the file with either File->Add... or File->Open...
Levels line
An optional line, but it must occur immediately after the Identifier
line or Categories line if it is used. The Levels line starts with
the string `Levels' and is followed by a list of taxonomic levels
that are used within the rest of the file. If the file is being added
to an existing database and if no Levels line is given, then the levels
within that database are considered valid. If the file is being used
to create a new database and no Levels line is given then only the taxonomic
levels of Kingdom, Genus and Species can be used. The typical Levels
line is:
Levels Kingdom Phylum Order Class Family Genus Species
The levels in the Levels line are only added if they don't already exist
in the database. If a new level is added, then it is added as high
in the list of database levels as possible as long as it is below any declared
level to its immediate left in the Levels line.
Categories line
An optional line, but it must occur immediately after the Identifier
line or Levels line if it is used. The Categories line starts with
the string `Categories' and is followed by a list of categories
that are used within the rest of the file. If the file is being added
to an existing database and if no Categories line is given, then the categories
within that database are considered valid. If the file is being used
to create a new database and no Categories line is given then no categories
are considered valid. The typical Categories line is:
Categories "Common Name"
The categories in the Categories line are only added if they don't already
exist in the database.
Name Statement
Declares the existence and acceptance of a set of related names.
It consists of a set of alternating levels and names separated by spaces.
For example,
Genus Armillaria Species mellea
A Name Statement is used to both add new scientific names to the database
and to ensure that the database does not accept another name over this
one. The above example makes sure that there is a genus Armillaria
and a species Armillaria mellea. It also makes sure Armillaria
mellea is considered valid in the database. It does not, however,
make sure that the genus Armillaria is considered valid. To
do that you would need to have an additional Name Statement that just consisted
of `Genus Armillaria'.
Strong Equivalence
Declares that two or more names are considered equivalent. Each
name is again described by a set of alternating levels and names separated
by spaces or by a category and a name. The first name is followed
by an equal sign, `='. Later names are separated by ampersands,
`&'. The first name is the name that the other names
are considered equivalent to. For example,
Genus Lepista Species nuda = Genus Clitocybe Species nuda & Genus Tricholoma Species nudum
Like a Name Statement, a Strong Equivalence can be used to create new names, but its primary purpose is to declare that some set of scientific names have been renamed or to declare the members of a group. In the above example, Clitocybe nuda and Tricholoma nudum are renamed to Lepista nuda. As a side effect, the line also guarantees that these genera and species all exist in the database. Note that the genera of the renamed species are not affected by the declaration.
Here's an example of a group declaration,
Common\ Name Pine\ Spike = Genus Chroogomphus species vinicolor & Genus Chroogomphus species rutilus
Again the scientific names will be created if they don't already exist.
If a group name is given on the right-hand side of the `=' and the left-hand
side is a scientific name, then that group name is considered to be the
`preferred' group name for that scientific name in the given category.
It is an error for a group name to appear on both the left- and right-hand
sides in a strong equivalence. It is also an error for more than
one group name in the same category to appear on the right-hand side.
Weak Equivalence
Declares that two or more names are historically related. Each
name is described by a set of alternating levels and names separated by
spaces or by a category and a name. The first name is followed by
a vertical bar, `|'. Later names are separated by ampersands,
`&'. For example,
Genus Lepiota | Genus Leucocoprinus
For scientific names, the Weak Equivalence is like a Strong Equivalence except that the validity of the names is not changed. It is typically used for taxa that have been broken apart into a set of newly accepted taxa. In the above example, both Lepiota and Leucocoprinus remain valid names. However, a weak relationship is established which could be used to interpret unrecognized names. For example, the database could now make the suggestion that Lepiota birnbaumii might refer to Leucocoprinus birnbaumii even if there is no explicit representation of this name. At the current time the database does not use this information, but it may in the future. In any case, this information can never be more than suggestive since the two names could refer to completely different taxa, one of which simply isn't in the database yet.
If a group name is given on the right-hand side, then the scientific
names on the left-hand side are considered `unapproved' members of the
group. This usually means that the name has historically been applied
to this taxon, but is not currently considered appropriate. If a
scientific name is given on the right-hand side and
a group name is on the left-hand side, then that group is no longer
the preferred group name for that scientific name.
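The six line types described above can be told apart with a small Python sketch. It only covers the syntax and does not modify any database. The function names and the tuple layout of the results are hypothetical, and two simplifications are assumed: the `=`, `|` and `&` operators are surrounded by whitespace as in the examples, and only the backslash form of escaping is handled (not quoted names).

```python
def split_tokens(line):
    """Split on whitespace, treating a backslash-escaped character as literal."""
    tokens, buf, escaped = [], [], False
    for ch in line:
        if escaped:
            buf.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch.isspace():
            if buf:
                tokens.append("".join(buf))
                buf = []
        else:
            buf.append(ch)
    if buf:
        tokens.append("".join(buf))
    return tokens


def to_pairs(tokens):
    """Group [level, name, level, name, ...] into (level, name) tuples."""
    return list(zip(tokens[0::2], tokens[1::2]))


def split_on(tokens, separator):
    groups, current = [], []
    for token in tokens:
        if token == separator:
            groups.append(current)
            current = []
        else:
            current.append(token)
    groups.append(current)
    return groups


def parse_line(line):
    """Classify one ASCII Extension Format line and break it into parts."""
    tokens = split_tokens(line)
    if not tokens:
        return ("blank",)
    if tokens == ["++"]:
        return ("identifier",)
    if tokens[0] == "Levels":
        return ("levels", tokens[1:])
    if tokens[0] == "Categories":
        return ("categories", tokens[1:])
    for operator, kind in (("=", "strong"), ("|", "weak")):
        if operator in tokens:
            i = tokens.index(operator)
            left = to_pairs(tokens[:i])
            right = [to_pairs(group) for group in split_on(tokens[i + 1:], "&")]
            return (kind, left, right)
    return ("name", to_pairs(tokens))


print(parse_line("Genus Lepista Species nuda = Genus Clitocybe Species nuda & Genus Tricholoma Species nudum"))
print(parse_line("Genus Lepiota | Genus Leucocoprinus"))
```

For an equivalence, the first list holds the level/name (or category/name) pairs to the left of the operator and the second holds one list of pairs per `&`-separated name on the right.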
Precise Effects
Name Statement: A general database search is done for the first level/name pair. If more than one is found, then an arbitrary one is chosen. For this reason it is strongly recommended that the level of the first name be at or above the Genus level since that should be unique. If no such name is found, then one is created. If additional level/name pairs are given, then they are searched for with the added constraint that the name described by the pairs to its left is an ancestor of the new name. This process repeats to the end of the line. Newly created names are automatically considered to be accepted children of the name described by the name/level pairs to its left, and this parent name is considered an accepted parent.
In addition, the right-most name (the `right-name') is `established'
in the database. This means that any accepted equivalence for the
right-name is cleared. In addition, if there are level/name pairs
to the left of the right-most name (the left-name), then the left-name
is or becomes a fully accepting ancestor of the right-name. In particular,
if the right-name is not an accepted descendant of the left-name, then
the right-name is added as an accepted child of the left-name. If
the left-name is not an accepted ancestor of the right-name then the left-name
becomes the accepted parent of the right-name. Finally, if there
is no left-name then if the right-name has an accepted parent, then the
right-name is an accepted child of that parent.
Strong Equivalence: The level/name pairs to the left of the `=' as well as the sets of pairs separated by any `&'s are searched for in the same way as a Name Statement. In addition the database is searched for any category/name pairs. A given category/name pair is guaranteed to be unique in a database. Any missing names are added to the database.
If the left-hand side is a group name, then the subordinate scientific names are all added to the given group as accepted members.
If the left-hand side is a scientific name, then the `subordinate scientific names' (the right-most names of each of the sets of name/level pairs to the right of the `='), are then searched for existing equivalences. The `accepted scientific name' (the name to the left of the `='), is made the accepted value for all existing equivalences that have any of the subordinate names as the accepted value.
If there are any existing equivalences whose members are all either the accepted scientific name or some of the subordinate scientific names, then an arbitrary one is chosen as the `target' equivalence. If no target equivalence is found and one is needed, then a new equivalence relationship is created to be the target. A target equivalence may not be needed since it is possible for subordinate scientific names to already refer to one another and to the accepted scientific name. For example, in the Strong Equivalence `Genus Armillaria Species mellea = Genus Armillariella Species mellea', there is only one `Species mellea' name needed. It just gets two parents with `Genus Armillaria' as the accepted parent. In this case no target equivalence is needed. If there is a target equivalence, then the accepted name is set as its accepted value and all of the subordinate names are added as values. The subordinate names take the target equivalence as their accepted equivalence. Any group names on the right-hand side are made the preferred group name for the accepted scientific name for the given category.
Each subordinate scientific name that is not the accepted scientific
name is removed from the list of accepted children for its accepted parent.
It is also removed from the accepted children of any name explicitly given
to its immediate left in the name/level pairs. The subordinate names
are not necessarily removed from the accepted children list for all of
their parents, since some of the parents may be unaccepted names that were
defined to accept this child. Finally, the accepted name is established
in the database as in a Name Statement.
Weak Equivalence: Identical to a Strong Equivalence with the following exceptions. The accepted values of existing subordinate scientific names are left unchanged and newly created subordinate names are not given accepted values. The subordinate scientific names are not removed from any accepted children lists and each new subordinate name is made an accepted child of the name to its left in the level/name pair. As in a Strong Equivalence the accepted scientific name is established in the database and becomes the accepted value of the target equivalence. The other equivalences are left unchanged. If there are any group names on the right-hand side and they are currently the preferred group name in their category, then the accepted scientific name no longer has a preferred group name in that category.
If the left-hand side is a group name, then the subordinate scientific
names are all made unaccepted members of the given group.
ASCII Description Format was created to describe organisms as key-value pairs and is used by the Taxy database program. It is supported as a read-only format in Name in order to leverage a large Taxy database that consists primarily of names. The parser in Name looks for strings separated by any of the following five delimiters: ()[],. The '\' character can be used as an escape character within the strings. This format must begin with the two character sequence '(['.
If the string "Genus" is encountered then the next string is read and the database is checked to see if that genus exists. If such a name exists, then any existing accepted equivalence is cleared and the accepted parent is set to accept this child. If no matching genus name is found, then it is created with its accepted parent set to one of the nodes at the top of the name hierarchy. The parser next looks for any comma separated strings that follow the given genus. These names are considered to be candidate equivalents to the first genus. They are only made into official equivalents if no "species" string is found before the next "Genus" string. Practically, this means that genus equivalents are only formed if they occur on a line by themselves*.
Once a genus has been found, if the string "species" is encountered then the next string is read and the database is checked to see if a species level node exists with the latest found genus as an ancestor. If one is found then any existing accepted equivalence is cleared and the latest genus node is made the accepted parent. If no matching species name is found, then it is created with its accepted parent set to the latest genus node. The parser again looks for a set of comma separated equivalents and handles them in the same way as the genus equivalents.
Note that common name information is not extracted from this format.
Example Taxy database:
([Genus(Chlorophyllum)Edibility(Poisonous)References(MD2)]
[Author(\(Fries\) Mass.)Genus(Chlorophyllum,Lepiota)species(molybdites,morgani)
Common name(Green-Spored Parasol)Collections(Nathan,Gregg)Edibility(Poisonous)References(MD2)]
[Genus(Lepiota)Edibility(Caution,Unknown,Edible,Choice,Dangerous,Poisonous,Deadly,Good,Hallucinogenic)References(MD2)]
[Genus(Lepiota)species(acutesquamosa)Edibility(Unknown,Edible)References(MD2)]
[Author(Zeller)Genus(Lepiota)species(atrodisca)Common name(Black-Eyed Parasol)
Collections(Gregg)Edibility(Dangerous,Unknown)References(MD2)]
[Author(Zeller)Genus(Lepiota)species(barsii,barssii)Common name(Gray Parsol)
Collections(Gregg)Edibility(Choice,Caution)References(MD2)]
[Genus(Lepiota,Leucocoprinus)species(brebissonii)Edibility(Caution,Unknown)References(MD2)]
[Genus(Lepiota,Leucocoprinus)species(breviramus)Edibility(Caution,Unknown)References(MD2)]
[Author(\(Fries\) Kummer)Genus(Lepiota)species(clypeolaria)
Common name(Shaggy-Stalked Parasol)Collections(Nathan,Gregg)Edibility(Poisonous)
References(MD2)]
[Author(\(Fries\) Kummer)Genus(Lepiota)species(cristata)Common name(Brown-Eyed Parasol)
Collections(Nathan,Gregg)Edibility(Unknown,Dangerous)References(MD2)]
[Author(\(Fries\) Singer)Genus(Leucoagaricus,Lepiota)
species(leucothites,naucinus,naucina,naucinoides)
Common name(Smooth Parasol,Woman on Motorcycle)Collections(Nathan,Gregg)Edibility(Good)
References(MD2)]
[Author(\(Fries\) Locquin)Genus(Leucocoprinus,Lepiota)species(birnbaumii,luteus,lutea)
Common name(Yellow Parasol,Flower Pot Parasol)Collections(Gregg)Edibility(Caution,Poisonous)
References([MD2[color(#70)]])]
[Author(\(Fries\) Pat.)Genus(Leucocoprinus,Lepiota)species(cepaestipes)
Common name(Onion Stalk Parasol)Edibility(Caution,Edible)References(MD2)]
[Genus(Macrolepiota)Edibility(Edible)References(MD2)]
[Author(\(Vitt.\) Singer)Genus(Macrolepiota,Lepiota,Leucoagaricus)species(rachodes,rhacodes)
Common name(Shaggy Parasol)Collections(Nathan,Gregg)Edibility(Choice,Caution)
References([MD2[color(#69)]])]
[Genus(Macrolepiota,Lepiota,Leucoagaricus)species(procera,procerus)
Common name(Parasol Mushroom)Edibility(Choice,Caution)References(MD2)])
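The tokenizing step for this format can be sketched in Python as follows. This is an illustration only: it splits the text on the five delimiters, honours the '\' escape, and collects the comma separated strings that follow each "Genus" or "species" key. It does not reproduce how Name then updates accepted parents or equivalences, and the function names are hypothetical.

```python
DELIMITERS = "()[],"


def tokenize_taxy(text):
    """Return ('str', value) and ('delim', char) tokens."""
    tokens, buf, escaped = [], [], False
    for ch in text:
        if escaped:
            buf.append(ch)      # escaped characters are always literal
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch in DELIMITERS:
            value = " ".join("".join(buf).split())  # trim and collapse whitespace
            if value:
                tokens.append(("str", value))
            tokens.append(("delim", ch))
            buf = []
        else:
            buf.append(ch)
    return tokens


def genus_species_entries(text):
    """List (key, names) for every Genus(...) and species(...) in the text."""
    if not text.lstrip().startswith("(["):
        raise ValueError("ASCII Description Format must begin with '(['")
    tokens = tokenize_taxy(text)
    entries = []
    i = 0
    while i < len(tokens):
        kind, value = tokens[i]
        if (kind == "str" and value in ("Genus", "species")
                and tokens[i + 1:i + 2] == [("delim", "(")]):
            names = []
            i += 2
            while i < len(tokens) and tokens[i] != ("delim", ")"):
                if tokens[i][0] == "str":
                    names.append(tokens[i][1])
                i += 1
            entries.append((value, names))
        i += 1
    return entries


SAMPLE = (
    "([Genus(Chlorophyllum)Edibility(Poisonous)References(MD2)]\n"
    "[Author(\\(Fries\\) Mass.)Genus(Chlorophyllum,Lepiota)species(molybdites,morgani)\n"
    "Common name(Green-Spored Parasol)Edibility(Poisonous)References(MD2)])"
)

for key, names in genus_species_entries(SAMPLE):
    print(key, names)
```

In each result the first name is the one looked up in the database and the remaining names are the candidate equivalents described above.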
* One slightly unintuitive side-effect of this policy is that segregate
genera that are no longer accepted should be explicitly made equivalents
of the genera that they are now included in when possible. Otherwise
adding an indeterminate collection assigned to the unaccepted segregate
genus will show up using that unaccepted name. A case in the 0.2
database where there is no really satisfying solution is the genus Scutiger.
This genus has been split across the genera Polyporus and Albatrellus so
there is no single appropriate equivalent. If you select this genus
and add it as an indeterminate species to a species list it will show up
as Scutiger sp. even though technically it is not a valid name.
However, as long as you select a known species within Scutiger it
will be correctly assigned to Polyporus or Albatrellus as
appropriate.
File issues:
In order demonstrate the power and functionality of the representation, a second goal is to create a tool for creating species lists and to maintain simple collection information for use during forays and fairs.
There are a number of advantages to using third party solutions. Using an existing database or spreadsheet program means that many of the low level issues have already been dealt with such data storage and retrieval, in some cases distributed network access, user interface building and printing. As a result development time can be substantially shorter. In addition, maintaining and extending such systems is typically easier as long as you remain within the bounds of the system. In general the third party systems have relatively easy to learn tools for adding or changing database functionality. In comparison computer languages such as C++ and Java require more experience to use effectively.
However, developing the program using standard development languages has a number of advantages. First, the developers have substantially greater control over all the details of the program. Ultimately this means that the resulting system can be substantially more efficient in terms of speed and memory requirements. This advantage is particularly important for this application since many of the users are expected to be using older computer systems.
In addition, third party solutions typically lock the developer into a particular style of data representation. This rigidity can have profound effects on the efficiency of desired features and can even make certain operations impossible. As an example, the relational database query language SQL is generally recognized as one of the leading database technologies. Most of the third party tools do not support a language as powerful as SQL and most of those that do actually use SQL. Unfortunately, SQL is well known to have difficulty with results that require a variable number of database accesses to compute. An example from Name is computing the complete Latin name of a taxon. Since a taxon can be at any level from kingdom to form, printing the complete Latin name requires a variable number of accesses into the database based on how deep the name is in the taxonomic hierarchy.
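The variable number of accesses can be illustrated with a small sketch. The table layout, the ids and the `lookup` helper below are assumptions made for illustration only and are not Name's actual storage format; the hierarchy is also abbreviated (the genus is attached directly to the kingdom).

```python
# Hypothetical table: node id -> (name string, level, accepted parent id).
NODES = {
    1: ("Fungi", "kingdom", None),
    2: ("Amanita", "genus", 1),
    3: ("muscaria", "species", 2),
    4: ("formosa", "variety", 3),
}


def lookup(node_id):
    """Stand-in for a single database access."""
    return NODES[node_id]


def name_path(node_id):
    """Return (names from the top level down to node_id, number of accesses)."""
    names = []
    accesses = 0
    current = node_id
    while current is not None:
        name, _level, parent = lookup(current)
        accesses += 1
        names.append(name)
        current = parent
    names.reverse()
    return names, accesses
```

In this sketch a species costs three lookups and a variety costs four, so the number of accesses depends on the depth of the taxon rather than being fixed in advance.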
Another advantage to using standard development tools is that the complete source code for the program can be distributed with the program. This means that the program is not dependent on the creators of the third party tool to support whatever types of computers you want to run the program on. In addition, it is much more likely that over time standard programming languages will remain in use than any of the particular third party tools. Programs created with standard programming languages are also by their nature more generally extensible and are much more likely to be easily integrated with other systems.
Finally, third party tools tend to be more expensive than development environments. This is particularly a problem if users rather than just developers have to pay to get the system to work.
A Group Node contains a name string and a list of 'accepted' Name Nodes as well as a list of all the Name Nodes that have ever had the group name applied to them. A given Group Node is a member of a 'Category'. Categories are primarily intended to support different languages. This means that English common names and French common names can not only have different name strings, but they can also contain different sets of taxa.
A Name Node contains a name string and a taxonomic level, e.g. "muscaria" and "species". The taxonomic levels are assumed to be completely ordered, meaning that for any two levels one is always higher than the other, e.g., genus is higher than species. All of the standard taxonomic levels are supported including sub-generic levels such as subgenus and section as well as sub-species levels like subspecies, variety and form. The most significant implication of this choice is that the standard Latin binomial (genus followed by species) cannot easily be used to ensure uniqueness. In fact the representations do not in any way take advantage of the supposed uniqueness of genus names or names for any levels above genus. From a larger historical perspective this is necessary since violations of such uniqueness rules have occurred.
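A completely ordered set of levels can be sketched with an integer enumeration. The class name `Level`, the numeric values and the helper `is_higher` are assumptions for illustration; only a subset of levels is listed, and intermediate levels between kingdom and genus are omitted.

```python
from enum import IntEnum


class Level(IntEnum):
    # Smaller value = higher taxonomic level (an arbitrary convention here).
    KINGDOM = 0
    GENUS = 1
    SUBGENUS = 2
    SECTION = 3
    SPECIES = 4
    SUBSPECIES = 5
    VARIETY = 6
    FORM = 7


def is_higher(a, b):
    """True if level a is higher in the hierarchy than level b."""
    return a < b
```

Because every level maps to a distinct integer, any two distinct levels can be compared, which is what "completely ordered" requires.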
Because the combination of a name string and a taxonomic level is not a unique specifier (e.g. "smithii" and "species" refers to a large number of fungal species), a Name Node also has a unique id. This unique id allows a given Name Node to refer to a single taxon. A given taxon, however, can be referred to by more than one Name Node. This is necessary if the taxon has been referred to by different names, e.g. the White Matsutake (Tricholoma magnivelare, which used to be known as Armillaria ponderosa) would be represented by separate Name Nodes for the species epithets that have been applied to it, i.e., "magnivelare" and "ponderosa".
Name Nodes are connected to each other by several distinct types of links. The simplest are parent/child links. These links indicate any connections that have ever been made between Name Nodes at different levels. Thus there would be a parent/child link between the node for the genus Tricholoma and the species magnivelare as well as one between the genus Armillaria and the species ponderosa. Note that a given Name Node can have multiple parents as well as multiple children. For example, the species Armillaria mellea was historically known as Armillariella mellea, therefore the Name Node for mellea has both Armillaria and Armillariella as parents. In computer science terms, the parent/child links create a Directed Acyclic Graph or DAG.
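The parent/child structure described above can be sketched as follows. The class layout, field names and the `link` helper are assumptions for illustration, not the program's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class NameNode:
    node_id: int
    name: str
    level: str
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)


def link(parent, child):
    """Record a bi-directional parent/child link, ignoring duplicates."""
    if child not in parent.children:
        parent.children.append(child)
        child.parents.append(parent)


armillaria = NameNode(1, "Armillaria", "genus")
armillariella = NameNode(2, "Armillariella", "genus")
tricholoma = NameNode(3, "Tricholoma", "genus")
mellea = NameNode(4, "mellea", "species")
ponderosa = NameNode(5, "ponderosa", "species")
magnivelare = NameNode(6, "magnivelare", "species")

link(armillaria, mellea)
link(armillariella, mellea)   # mellea now has two parents
link(armillaria, ponderosa)   # Armillaria now has two children
link(tricholoma, magnivelare)
```

Since links only ever run from a higher level to a lower one, no cycle can form, which is why the result is a DAG rather than a tree.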
In addition to parent/child links, there are accepted parent and accepted child links. These links indicate which parent/child links are currently considered to be 'accepted'. Thus there would be an accepted parent link from mellea to Armillaria and an accepted child link from Armillaria to mellea. Unlike the parent/child links, the accepted parent and accepted child links are treated as separate, one-directional links. This allows Armillariella to have an accepted child link going to mellea, or for ponderosa to have an accepted parent link going to Armillaria. Finally, a particular Name Node can have at most one accepted parent link, though of course it can have any number of accepted child links. Thus the accepted links form a strict hierarchy.
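A minimal sketch of the accepted links, assuming hypothetical field and function names. The two directions are stored independently, and a node holds a single accepted parent slot so that it can never have more than one.

```python
class NameNode:
    def __init__(self, name, level):
        self.name = name
        self.level = level
        self.accepted_parent = None    # at most one accepted parent
        self.accepted_children = []    # any number of accepted children


def set_accepted_parent(child, parent):
    """One-directional: replaces any previous accepted parent of child."""
    child.accepted_parent = parent


def add_accepted_child(parent, child):
    """One-directional: does not touch child.accepted_parent."""
    if child not in parent.accepted_children:
        parent.accepted_children.append(child)


def accepted_lineage(node):
    """Follow accepted parent links upward and return names from the top down."""
    lineage = []
    while node is not None:
        lineage.append(node.name)
        node = node.accepted_parent
    return lineage[::-1]


armillaria = NameNode("Armillaria", "genus")
armillariella = NameNode("Armillariella", "genus")
mellea = NameNode("mellea", "species")
ponderosa = NameNode("ponderosa", "species")

set_accepted_parent(mellea, armillaria)
add_accepted_child(armillaria, mellea)
add_accepted_child(armillariella, mellea)   # no matching accepted parent link
set_accepted_parent(ponderosa, armillaria)  # no matching accepted child link
```

Here `accepted_lineage(mellea)` follows a single chain upward, reflecting the strict hierarchy formed by the accepted parent links.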
In addition to Name Nodes there is a set of Equivalence Nodes. Equivalence Nodes represent collections of Name Nodes that refer to the same taxon. As with Name Nodes, Equivalence Nodes have simple bi-directional member links and separate accepted links. Name Nodes have at most one 'accepted equivalent' link. The presence of an accepted equivalent link indicates that a particular Name Node is not considered a valid name. Every Equivalence Node has exactly one 'accepted value' link which points to the Name Node which is or has been the valid name for all the members of the Equivalence Node. A Name Node which is the 'accepted value' for an Equivalence Node cannot have that Equivalence Node as its 'accepted equivalent'.
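The rules above can be sketched as follows. The class layout and the names `add_rejected_member` and `accepted_name` are assumptions for illustration; the sketch also assumes that the accepted value is itself listed as a member, which the description does not state.

```python
class NameNode:
    def __init__(self, name):
        self.name = name
        self.accepted_equivalent = None  # an EquivalenceNode, or None if valid


class EquivalenceNode:
    def __init__(self, accepted_value):
        self.accepted_value = accepted_value  # exactly one accepted value
        self.members = [accepted_value]       # assumption: value is a member

    def add_rejected_member(self, node):
        """Add node as a member and mark it as not currently valid."""
        if node is self.accepted_value:
            raise ValueError("accepted value cannot accept its own equivalence")
        if node not in self.members:
            self.members.append(node)
        node.accepted_equivalent = self


def accepted_name(node):
    """Follow accepted equivalent links until a valid Name Node is reached."""
    seen = set()
    while node.accepted_equivalent is not None and id(node) not in seen:
        seen.add(id(node))
        node = node.accepted_equivalent.accepted_value
    return node


magnivelare = NameNode("magnivelare")
ponderosa = NameNode("ponderosa")
matsutake = EquivalenceNode(accepted_value=magnivelare)
matsutake.add_rejected_member(ponderosa)
```

With this setup, looking up the accepted name for "ponderosa" yields the "magnivelare" node, while "magnivelare" resolves to itself because it has no accepted equivalent link.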
It can also be the case that a particular Equivalence Node has no Name Node that accepts it. In addition, Name Nodes can have more than one equivalent. An example of these cases is the set of genera that are members of Lepiota sensu lato (in the broad sense). These include the genera Lepiota, Leucoagaricus, Leucocoprinus and Macrolepiota. Some authors do not accept any of the last three and call them all Lepiota. Other authors accept Leucoagaricus and Leucocoprinus but reject Macrolepiota and so on. The members of these genera were once considered to be members of Lepiota, but the other three genera have never been considered to overlap. Hence the genus Lepiota needs to have three separate Equivalence Nodes, each of which includes one of the three other genera. In addition, if the user chose to accept all four genera then none of the Name Nodes would have an accepted equivalent. Each Equivalence Node should, however, have Lepiota as its accepted value since it is the older name.
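The Lepiota case can be sketched with plain dictionaries. The structure and the function `displayed_genus` are assumptions for illustration; the set of rejected genera stands in for whichever accepted equivalent links a given user has chosen to create.

```python
SEGREGATES = ["Leucoagaricus", "Leucocoprinus", "Macrolepiota"]

# One hypothetical Equivalence Node per segregate genus, each with
# Lepiota as its accepted value.
EQUIVALENCES = {
    genus: {"members": {"Lepiota", genus}, "accepted_value": "Lepiota"}
    for genus in SEGREGATES
}


def displayed_genus(genus, rejected):
    """Return the genus name to show, given the set of rejected segregates."""
    if genus in rejected:
        return EQUIVALENCES[genus]["accepted_value"]
    return genus
```

An author who rejects only Macrolepiota would call `displayed_genus` with `rejected={"Macrolepiota"}`, while a user who accepts all four genera would pass an empty set and every name would be shown unchanged.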
1) Basic standard functionality - File saving and loading, cut and paste, window manipulation, quitting etc. This functionality is provided through standard pull down menus.
2) Name search and selection - The process by which sets of Name and Group Nodes are selected for further processing.
3) List maintenance and manipulation - The process by which lists of Name and Group Nodes, or 'Collection Lists', are created and manipulated.
4) Name and relationship editing - The process by which the Name, Equivalence and Group Node structures are modified including the creation of new nodes.
5) Collection description - The process by which collection information is associated with the Name Nodes in a Collection List. This includes specifying data for new collections, editing data of existing collections and determining what data should be collected.
Because the Search Window will be heavily used by most users, extensive keyboard support is provided. At any given time there is at most a single selected Selection Panel. This Selection Panel is highlighted. If any items are selected in the selected Selection Panel, then the list for the panel to its immediate right is displayed. If that panel only contains a single item, then it is automatically selected and the panel to its immediate right is computed and so on. Selection Panels have a 'Name List', a 'Selection' and a 'Selection String' that are used to determine and modify the set of nodes currently selected. The Selection is the set of items that are currently selected in the Name List. All items in the Selection are highlighted. The Selection String is a visible, editable string that is a prefix for the Selection. If the Selection String is modified, the Selection is changed to all the items which match the Selection String. The Selection can also be modified directly through the Name List.
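The relationship between the Selection String and the Selection can be sketched as a prefix filter. The function name is hypothetical, and two behaviors are assumptions not stated in the text: matching ignores case, and an empty Selection String selects nothing.

```python
def select_by_prefix(name_list, selection_string):
    """Return every item in name_list that starts with selection_string."""
    if not selection_string:
        return []  # assumption: an empty string yields an empty Selection
    prefix = selection_string.lower()
    return [name for name in name_list if name.lower().startswith(prefix)]
```

For a Name List of Amanita, Armillaria, Armillariella and Boletus, typing "arm" would select both Armillaria and Armillariella, and extending the string to "armillarie" would narrow the Selection to Armillariella alone.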
The up and down arrows clear the Selection String and make the Selection the item above or below the current Selection. Tab and shift-tab move to the following or preceding Selection Panels respectively. Selecting an item with a mouse clears the Selection String. Unless the appropriate modifier key is held down, the Selection is set to the selected item. Otherwise, the selected item is added to the Selection.
Finally, the <enter> or <return> keys add the Selection from the current Selection Panel to all unlocked List Windows. This operation can also be performed by pressing the Add Selection button. Names not in the current database can be added to the List Windows by typing them into the appropriate Selection Strings and hitting <enter> or <return> while holding down the shift key or by pressing the Add Text button.
The Create Taxon Window allows the user to specify a name and a level for a new node and allows the user to select the accepted parent for the new node. The parent selection is handled in the same way as the Search Window and is initialized to the current values in the Search Window. The information is only added to the database when the user presses the Create button.
The Rename Taxon Window allows the user to select a name node through the usual mechanism. Once a name node is selected, a scrolling list of that node's equivalent names is provided for the user to choose from. Selecting from this list changes the accepted and equivalence information for the selected taxon. The user can also directly add a new name. When they enter a new name, the user can request the system to search through the parents of the node for another node with the same name. If one or more are found they are added to the list of equivalents that can be selected.
Renaming a node in this way can have significant indirect effects on other nodes. In particular, if the selected alternative is not currently accepted, then this node and all its children will become inactive. By default the system tries to ensure that the new node is accepted, but this behavior can be turned off. In addition, when a node is renamed it is unclear what the desired behavior is for the children of the node. Should they remain attached to the old name and thereby not be accepted or should they become children of the new name node? The user has the choice to transfer all the children to the new node, transfer only the accepted children or transfer none of the children. In all cases the children remain children of the original node. The default behavior is for all the children to be transferred.
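The three child transfer choices can be sketched as follows. The dictionary layout, the policy strings and the placeholder names in the usage note are hypothetical, and the sketch models only plain child links; it does not attempt to update accepted links.

```python
def transfer_children(old, new, policy="all"):
    """Attach children of old to new according to policy.

    old and new are dicts with 'children' and 'accepted_children' lists.
    The children always remain attached to old. Returns the transferred list.
    """
    if policy == "all":
        moving = list(old["children"])
    elif policy == "accepted":
        moving = [c for c in old["children"] if c in old["accepted_children"]]
    elif policy == "none":
        moving = []
    else:
        raise ValueError("unknown policy: " + policy)
    for child in moving:
        if child not in new["children"]:
            new["children"].append(child)
    return moving
```

For example, if a placeholder node "OldGenus" has children "alpha" and "beta" with only "alpha" accepted, the "accepted" policy attaches "alpha" to the new node while both children stay listed under "OldGenus".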
Because the effects of renaming can be confusing, the Rename Window provides a 'Review' button which lists all the name changes that will occur when a rename action is taken. Once the desired behavior is found, the user has a final choice of whether to make the new name also synonymous to other synonyms, create it as an independent synonym or to try to actually replace the name. The last of these behaviors is only possible if the node has been added during the current session. It is intended primarily to correct mistakes and typos.
The Transfer Taxon Window allows users to add parents or change the accepted parent of a taxon. The window allows the user to select the target taxon and a new parent taxon. By default the new parent becomes the accepted parent. Normally a link to the previous parent is maintained by the system. However, if the given taxon was created during the current session then the user can choose to break that link.
The Group Window allows the user to create and modify named, arbitrary collections of taxa.
The Link and Node Editor allows the user to add or remove arbitrary links between any of the nodes used by the system. This interface is intended to be used by experienced users for making unusual modifications to the structure. Basic sanity checks are made to ensure that the system remains consistent. A taxon can be selected in a Link Editing Window in the usual way. The window contains a scrolling list of link types. When a link type is selected, a scrolling list of existing links of that type is displayed, along with a method specific to that link type for selecting other nodes. Existing links can be selected and removed. New nodes can be selected and linked in.