e olif The olif element is the base document element of a document in Open Lexicon Interchange Format (OLIF).
a OlifVersion The OlifVersion attribute holds data about the version of OLIF to which the XML instance (document) conforms. The OLIF Consortium publishes the string identifier that may be used for the OlifVersion attribute.
e body The body element groups a list of entries which contain linguistic/lexical/terminological data categories for entry strings/designators.
e entry The entry element groups all of the linguistic/lexical/terminological data categories related to a single entry string/designator.
e mono The mono element groups the monolingual data within an entry.
e crossRefer The crossRefer element groups the data categories for cross-references. Cross-references define relations between the given entry (link source) and other entries in the lexicon (link target) in the same language.
e crLinkType The crLinkType element classifies the relation between the entry from which the link originates and the entry to which the link points. The possible relations include ISO relations (most of which formally apply to concepts rather than the terms themselves; they have been adapted here for the purposes of OLIF) and the analysis contained in EuroWordNet (July 2000). Example values: synonym, antonym
e orthVariantType The orthVariantType element classifies the type of orthographic variant that the target of a cross-reference represents (currently only used for German, for example to list old/new spelling). Example values: german-4
e monoDC The monoDC element groups optional data categories for administrative, morphological, syntactic and semantic data.
e monoAdmin The monoAdmin element groups the administrative data within a monolingual entry.
e userDesignat The userDesignat element holds a user designator of an entry string. The userDesignat element can be used if the entry string needs to be represented in a form other than the canonical form.
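The grouping elements described so far can be pictured as a small tree. The following is a minimal sketch in Python, not a normative OLIF instance: the nesting shown (olif > body > entry, with mono and crossRefer as children of entry) is an assumption inferred from the descriptions, and the version string is a hypothetical placeholder, since the real identifier is published by the OLIF Consortium.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder; the actual identifier is published by the OLIF Consortium.
ASSUMED_OLIF_VERSION = "ASSUMED-VERSION"


def build_skeleton(link_type: str = "synonym") -> ET.Element:
    """Build a bare olif tree; the nesting is assumed, not taken from the schema."""
    olif = ET.Element("olif", {"OlifVersion": ASSUMED_OLIF_VERSION})
    body = ET.SubElement(olif, "body")          # groups the list of entries
    entry = ET.SubElement(body, "entry")        # one entry string/designator
    ET.SubElement(entry, "mono")                # monolingual data of the entry
    cross = ET.SubElement(entry, "crossRefer")  # same-language cross-reference
    ET.SubElement(cross, "crLinkType").text = link_type
    return olif


if __name__ == "__main__":
    print(ET.tostring(build_skeleton(), encoding="unicode"))
```

A real instance would also carry the key data categories and a link target inside each group; they are left out here to keep the sketch to the elements already introduced.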
e syllabification The syllabification element holds data about the syllable boundaries within the entry string. Example use: do-cu-men-ta-ry, li-be-ra-li-ty
e geogUsage The geogUsage element holds data about the geographical usage, or dialect, of the entry string. Example values: CA, GB
e entryType The entryType element classifies the entry string as being a product name, trademark, or orthographic variant (note that orthographic variants may also be encoded as cross-references). Example values: trademark, orth-var
e entryFormation The entryFormation element classifies the shape/structure of the entry string. Example values: abb, acr
e phraseType The phraseType element classifies the phrasal type of an entity. Example values: mw
e entryStatus The entryStatus element classifies the entry status of an entry within a given lexicon/termbase (note that there exists a separate data category for the administrative status). Example values: word
e entrySource The entrySource element holds data about the entry source, or the lexicon/termbase that the entry originated from. Example use: TermDB for software package X
e originator The originator element holds data about the individual who originated the entry. Example use: Christopher Columbus
e adminStatus The adminStatus element classifies the administrative status of an entry relative to a given work environment. Example values: ver
e company The company element holds information about the company/organisation for which the entry is valid. Example use: LongDistanceRunners Ltd.
e abbrev The abbrev element holds data about an abbreviated form of the entry string (note that abbreviations may also be encoded as cross-references). Example use: ERP
e orthVariant The orthVariant element holds data about an orthographic variant of the entry string (note that orthographic variants may also be encoded as cross-references).
Example use: auf Grund
e depSynonym The depSynonym element holds data about a rejected or deprecated synonym of the entry string. Example use: IS-H
e timeRestrict The timeRestrict element holds data about a time restriction, or the period of time during or since which usage of the entry is valid. Example use: 20011115T140324Z/20011215T140324Z
e product The product element holds data about a product for which an entry is valid. Example use: Spreadsheet3005
e project The project element holds data about a project for which an entry is valid. Example use: localization of product X from English into German
e confidence The confidence element holds data from terminology extraction. The value of the confidence element indicates how confident the term extraction program is that the candidate really is a term. Example values: 0.99, high
e monoMorph The monoMorph element groups the morphological information within a monolingual entry.
e morphStruct The morphStruct element holds data about the morphological structure of the entry string (note the possibilities provided for multiwords by means of the synStruct element). Example use: #[[gebrauch+s]:[gegen+stand]]#
e inflection The inflection element holds data about the inflection pattern(s) of the entry string (or its head in case of a multiword/phrasal entry). Example use: book, 16
e head The head element holds data about the head word in a multiword/phrasal entry string. Example use: infotype (planned compensation infotype)
e gender The gender element classifies grammatical gender. Example values: m, f
e case The case element classifies grammatical case. Example values: d, a, loc
e number The number element classifies grammatical number. Example values: sg, du
e person The person element classifies grammatical person. Example values: first, sec
e tense The tense element classifies verb tense. Example values: pres, fut
e mood The mood element classifies verb mood or mode.
Example values: imper, cond
e aspect The aspect element classifies verbal aspect. Example values: perf, iter
e degree The degree element classifies adjectival degree type. Example values: comp, sup
e auxType The auxType element classifies the auxiliary type for an auxiliary verb. Example values: have, faire
e monoSyn The monoSyn element groups the syntactic information within a monolingual entry.
e synType The synType element classifies the general syntactic behavior of the entry string. Example values: cnt, refl, attrib
e synPosition The synPosition element classifies the unmarked positioning of the entry string syntactically. Example values: prenoun, cl-init
e transType The transType element classifies the transitivity type of a verb. Example values: trans, ditrans
e synStruct The synStruct element holds data about the constituent structure of a multiword entry string (note the possibilities provided for single words by means of the morphStruct element). Example use: [[adj][noun]] (General Ledger)
e synFrame The synFrame element classifies the syntactic frame for the entry string (subcategorisation). Example values: subj-imps-opt, dobj-opt
e prep The prep element holds data about prepositions that further specify syntactic frame elements. Example use: into, about, from, mit, wegen, ausser
e verbPart The verbPart element holds data about verb particles that further specify syntactic frame elements. Example use: down, up, over
e monoSem The monoSem element groups the semantic information within a monolingual entry.
e definition The definition element holds a prose definition of the entry string. Example use: Collection of interfaces usable by a programmer
e natGender The natGender element classifies the biological gender associated with the entry. Example values: m, f, un
e semType The semType element classifies an entry string with respect to a semantic type classification structure.
Example values: anim-hum-pn, cnc-class
e header The header element groups data categories that hold information about the data that has been encoded (thus, header holds meta-data).
e dataCatReg The dataCatReg element groups data categories for extensions to extensible OLIF data categories (like ptOfSpeech). Whenever users choose to make use of a user extension (for example by supplying their own tag set for part-of-speech), they explain the overall listing of the data categories and values they use (for example via a URL placed in the ptOfSpeechDCS element of the dataCatReg element). The dataCatReg element contains several data category specifications (DCS).
e ptOfSpeechDCS The ptOfSpeechDCS element (DCS is short for data category specification) holds data about a user-extended scheme for describing the part-of-speech of OLIF entries. Users can for example describe their additional part-of-speech tags by means of a URL or by means of CDATA sections. Example use: http://www.company.com/nlp/ptOfSpeech/projectX.htm
e subjFieldDCS The subjFieldDCS element holds data about a user-extended scheme for describing the subject field information of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e semReadingDCS The semReadingDCS element holds data about a user-extended scheme for describing the semantic reading information of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e crLinkTypeDCS The crLinkTypeDCS element holds data about a user-extended scheme for describing the types of cross-references between OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e orthVariantTypeDCS The orthVariantTypeDCS element holds data about a user-extended scheme for describing the orthographic variants of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e morphStructDCS The morphStructDCS element holds data about a user-extended scheme for describing the internal morphological structure of entry strings/designators (see the comment for the ptOfSpeechDCS element for more information).
e inflectionDCS The inflectionDCS element holds data about a user-extended scheme for describing the inflection of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e aspectDCS The aspectDCS element holds data about a user-extended scheme for describing the aspect of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e synTypeDCS The synTypeDCS element holds data about a user-extended scheme for describing the syntactic type of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e synFrameDCS The synFrameDCS element holds data about a user-extended scheme for describing the syntactic frames of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e synStructDCS The synStructDCS element holds data about a user-extended scheme for describing the syntactic structures of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e semTypeDCS The semTypeDCS element holds data about a user-extended scheme for describing the semantic types of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e conceptHierarchyDCS The conceptHierarchyDCS element holds data about a user-extended scheme for describing the concept hierarchy/ontology of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
e contentInfo The contentInfo element groups data categories related to the practice adopted for encoding quotation marks, abbreviations etc.
e quotMarkInfo The quotMarkInfo element holds data about editorial practice adopted with respect to quotation marks. Example use: our open quote is '!'
and our closing quote is '$'
e syllabificationMarkInfo The syllabificationMarkInfo element holds data about editorial practice adopted with respect to syllabification in the original. Example use: we use '*' as marker
e abbrevHandling The abbrevHandling element holds data about how abbreviations are represented. Two options exist: via the abbrev element or via a crossRefer element. Example use: we use both the abbrev element and the crossRefer element
e langIdUse The langIdUse element holds data about the way language identifiers have been used. Possible values: region_standard - the region part of a locale (e.g. the CA in FR_CA) has been used even if the term also exists in the unrestricted locale (e.g. French as a whole); region_exception - the region part of a locale has been used only if the term does not exist in the unrestricted locale.
e valueDefaults The valueDefaults element groups information about the default values for various data categories. Whenever an OLIF entry does not specify a value for one of these data categories, information from the valueDefaults element should be applied.
e valDefault The valDefault element holds data about the default value for one specific data category. Example use: The example below shows how to set the default for the data category 'product' to the string 'OLIF Converter': OLIF Converter
e workflowInfo The workflowInfo element holds data about user-specific workflow support. Example use: to be validated by 31 Dec 2001 at the latest
e termExtractInfo The termExtractInfo element holds data which is relevant for terminology extraction (e.g. name and size of corpus to which term extraction has been applied).
e fileDesc The fileDesc element groups data categories relating to physical features of the OLIF instance (document).
e fileName The fileName element holds data about the name of the OLIF file. Example use: olifForAgency14Jan02.xml
e fileId The fileId element holds data about a unique identifier (e.g.
a globally unique identifier) of the OLIF file. Example use: 011000358700000683362001E.xml
e fileExtent The fileExtent element groups data categories related to counts of items (for example number of entries) in the contents of the OLIF instance.
e conceptCount The conceptCount element holds data about the number of concepts in the OLIF document.
e entryCount The entryCount element holds data about the number of entries in the OLIF document.
e termCount The termCount element holds data about the number of terms (generally defined as those entries which are both not general vocabulary and distinguished from one another by the values of the key data categories) in the OLIF document.
e byteCount The byteCount element holds data about the size of the OLIF document including its tags, in its representation as a text file encoded in the character set mentioned in the encoding attribute of the XML declaration. This is useful for calculating media requirements or file download times.
e publStmt The publStmt element groups data categories related to the distributor and the owner of the OLIF document. The publStmt element also gives supplementary information about the OLIF document (e.g. copyright protection).
e distributor The distributor element holds data about the person or institution who distributes the OLIF document.
e address The address element holds data about a postal address of the distributor.
e telephone The telephone element holds data about the telephone number of the person or institution who distributes the OLIF file (preferably in a format conformant to ITU-T/CCITT Recommendation E.123).
e fax The fax element holds data about the fax number of the person or institution who distributes the OLIF file (preferably in a format conformant to ITU-T/CCITT Recommendation E.123).
e eAddress The eAddress element holds data about an electronic address of the person or institution who distributes the OLIF file.
Note that more than one occurrence of this tag can appear, so that multiple addresses (possibly of different types) can be included.
e availability The availability element holds data about the availability of an OLIF file, for example, any restrictions on its use or distribution, its copyright status, etc. A company may use 'Available upon written agreement' to indicate that the OLIF file may not be freely redistributed.
e idNo The idNo element holds data about a number (e.g. ISBN) used to identify an OLIF document.
e date The date element holds data about a date. Its value must be in ASCII, in the format YYYYMMDDThhmmssZ (e.g. 19970811T133402Z for August 11th 1997 at 13:34:02, i.e. 1:34 pm and 2 seconds). This is one of the options described in ISO 8601:1988. The value is preferably given in Coordinated Universal Time (UTC; as indicated by the terminal Z). The DateValue attribute can be used to specify the date in an arbitrary format.
e owner The owner element holds data about the person or institution that owns the OLIF document.
e replacements The replacements element groups data categories for string replacements that should be applied to the document. The replacements element helps to compress data and might for example specify one value for the date element of a list of 1000 elements.
e mapping The mapping element groups a mapValue and a mapTarget. The mapValue should be used for the item designated by the mapTarget.
e mappingValue The mappingValue element holds data about a replacement string that is used in a mapping.
e mappingTarget The mappingTarget element holds data about an item to which a replacement should be applied.
e name The name element holds data about a name (e.g. of a distributor or owner).
e prop The prop element holds data about non-standard (proprietary) information in an OLIF document. It may be used for communicating tool-specific information.
a CreaTool The CreaTool attribute holds data about the tool that created the OLIF document.
Its possible values are not specified in OLIF but each tool provider will publish the string identifier it uses. Example use: CoolTermExtract
a CreaToolVersion The CreaToolVersion attribute holds data about the version of the tool that created the OLIF document. Its possible values are not specified in OLIF but each tool provider will publish the string identifier it uses. Example use: 2.14
a OrigFormat The OrigFormat attribute holds data about the format of the file from which the OLIF document has been generated. The format specification may include a product name and even a version tag. This may lead to format specifications like the following: LOGOS-eSense, LOGOS-LDE-1.1, LOGOS-LDE-1.2
a AdminLang The AdminLang attribute holds data about the default language for the administrative and informative elements 'note' and 'prop'. The value of the AdminLang attribute must be one of the ISO 3166/639 language identifiers (2 or 3-letter code) or one of the standard locale identifiers (2 or 3-letter language code, dash, 2-letter territory/country code). Example use: en
a CreaDate The CreaDate attribute holds data about the date of the creation of the element. Its value must be in ASCII, in the format YYYYMMDDThhmmssZ (e.g. 19970811T133402Z for August 11th 1997 at 13 hours 34 minutes 2 seconds). This is one of the options described in ISO 8601:1988. The value should be given in Coordinated Universal Time (UTC; as indicated by the terminal Z). Example use: 19970811T133402Z
a CreaId The CreaId attribute holds data about the user who created the element. Example use: Lars Nauter
a DCSType The DCSType attribute classifies a data category specification. Possible values: replacement - replace existing OLIF values; extension - extend (add to) the predefined OLIF values.
a InflectionDCSType The InflectionDCSType attribute classifies how inflection information has been encoded.
Possible values: classDesignator - reference to a code/designator from a classification scheme; inflectsLike - example
a QuotMarkRet The QuotMarkRet attribute classifies the convention used for retaining quotation marks. Possible values: none - no quotation marks have been retained; some - some quotation marks have been retained; all - all quotation marks have been retained
a QuotMarkForm The QuotMarkForm attribute classifies the standardization of quotation marks. Possible values: std - use of quotation marks has been standardized and open and close quote marks are distinct; nonStd - open and close quote marks are represented indiscriminately; unknown* - use of quotation marks is unknown
a ValDefaultRefType The ValDefaultRefType attribute classifies the OLIF item to which a value default refers. Possible values: el - element; att - attribute; en - entity.
a ValDefaultRefName The ValDefaultRefName attribute holds data about the name of the element, attribute or entity to which a value default is related.
a ByteCountUnit The ByteCountUnit attribute classifies the unit in which the byte count is measured. Possible values: bytes - bytes; kb* - kilobytes; mb - megabytes; gb - gigabytes
a DistributorType The DistributorType attribute classifies a distributor. Possible values: person - name of a person; place - name of a place; org - name of an organization article in a periodical; cmp - name of a company
a EAddressType The EAddressType attribute classifies the electronic address (email address, web site, ftp site, etc.). Possible values: email* - the value is an electronic mail address; url - the value is a URL
a Region The Region attribute holds data about the territories within which rights related to the OLIF data apply. Possible values: world* - the text is freely available; eu - European Union only
a PubStatus The PubStatus attribute classifies the current availability of the OLIF data.
Possible values: restricted - the text is not freely available; unknown* - the status of the text is unknown; free - the text is freely available
a IdNoType The IdNoType attribute holds data about a name or abbreviation (e.g., isbn) identifying what type of identifying number is given. Possible values: isbn* - the value is an International Standard Book Number (ISBN)
a DateValue The DateValue attribute holds data about a date in ISO 8601 format.
a OwnerType The OwnerType attribute classifies an owner. Possible values: natPerson - name of a person; place - name of a place; org - name of an organization article in a periodical; cmp - name of a company
a PropType The PropType attribute holds data about the kind of data a prop element represents.
a PropLang The PropLang attribute holds data about the language used in a prop element.
e keyDC The keyDC element groups the five key data categories whose values uniquely identify an entry.
e canForm The canForm element holds the entry string, represented in canonical form in accordance with OLIF guidelines. Example use: success story
e language The language element encodes the language to which the entry string belongs. Example values: fr, en
e ptOfSpeech The ptOfSpeech element classifies the part-of-speech represented by the entry string. In cases of phrases/multiword entries, the value for part-of-speech depends on the function of the phrase/multiword within a clause; the part-of-speech of the head element often indicates the part-of-speech value for the entire phrase/multiword string. Example values: noun, verb
e subjField The subjField element classifies the knowledge domain to which the lexical/terminological entry is assigned. Example values: agriculture, aviation
e semReading The semReading element classifies readings for entries with identical values for canonical form, language, part-of-speech, and subject field.
Example values: color, definite space
e generalDC The generalDC element groups general data categories. General data categories are optional elements that can be used in any of the top-level OLIF groups for entries (mono, crossRefer, or transfer).
e updater The updater element holds data about the individual who last modified the entry. Example use: Jessica King
e modDate The modDate element holds data about the date on which the entry was last modified. Example use: 20011115T140324Z
e example The example element holds data about a sample text or portion of text that contains the entry string as an illustration of usage. Example use: ERP is on the rise again.
e usage The usage element holds data about a usage note for the entry string. Example use: Never use this when talking about ERP.
e note The note element holds data about a note, or commentary, on an entry by a lexicographer/terminologist. Example use: Never translate this.
e locInfo The locInfo element holds data about localization-relevant information (e.g. product version, component name, operating system platform, or build number).
a KeyDCUserId The KeyDCUserId attribute holds data about a user-defined identifier of a grouping of OLIF key data categories. This identifier can for example be used in cross-references.
a KeyDCUniversalId The KeyDCUniversalId attribute holds data about a universal identifier (i.e. one which is unique not only in the user's environment but worldwide) of a grouping of OLIF key data categories. This identifier can for example be used in cross-references.
a NoteType The NoteType attribute holds data for categorizing notes (e.g. 'for localizer', 'for quality management').
e transfer The transfer element groups data categories which define bilingual transfer relations between the given entry and other entries in the lexicon in different languages (cf. crossRefer elements, which point to entries in the same language).
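The modDate example, like the date element and the CreaDate attribute, uses the compact ISO 8601 form YYYYMMDDThhmmssZ. The sketch below shows one way to read and write such values with the Python standard library. The function names are hypothetical, and the slash-separated interval syntax is an assumption inferred only from the single timeRestrict example (20011115T140324Z/20011215T140324Z).

```python
from datetime import datetime, timezone

# strptime/strftime pattern for YYYYMMDDThhmmssZ; "T" and "Z" are literal characters.
OLIF_TIMESTAMP_FORMAT = "%Y%m%dT%H%M%SZ"


def parse_olif_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 19970811T133402Z into a UTC-aware datetime."""
    naive = datetime.strptime(value, OLIF_TIMESTAMP_FORMAT)
    return naive.replace(tzinfo=timezone.utc)


def format_olif_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime in the compact UTC form."""
    return moment.astimezone(timezone.utc).strftime(OLIF_TIMESTAMP_FORMAT)


def parse_time_restriction(value: str) -> tuple[datetime, datetime]:
    """Split a start/end pair; the slash syntax is assumed from the timeRestrict example."""
    start, end = value.split("/")
    return parse_olif_timestamp(start), parse_olif_timestamp(end)


if __name__ == "__main__":
    print(parse_olif_timestamp("20011115T140324Z").isoformat())
```

Values that do not match the pattern raise ValueError, which is usually preferable to silently accepting a date in some other layout.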
e trRestrictStmt The trRestrictStmt element groups multiple related transfer restrictions (e.g. alternatives connected via the logical operator OR).
e trRestrict The trRestrict element groups data categories for a single transfer restriction.
e structChangeStmt The structChangeStmt element groups multiple related structural changes (which can be connected via the logical operator AND).
e structChange The structChange element groups data categories related to a change in the target language vis-a-vis the source structure based on the transfer restriction having been satisfied. Structural changes are definable for the following parts-of-speech: noun, verb, adjective, preposition.
e changeType The changeType element holds data related to the type of change. Example values: change-role, add-in-target
e changePOS The changePOS element holds data about the part of speech of an element being added or deleted. Example values: noun, adj
e changeValue The changeValue element holds data about the string or data category being changed. Example values: active, subj-dobj
e equival The equival element holds data about the degree of transfer relationship between words/phrases in two different languages. Example values: full, partial
e contextStmt The contextStmt element groups multiple related contexts (contexts can be connected by means of logical operators).
e context The context element holds data about one of the following: a) the context for a given translation of a source word/phrase into a target word/phrase; b) the context for a structural change in the target language. Example values: pp, genobj
e testStmt The testStmt element groups multiple related tests (connected by means of logical operators).
e test The test element holds data about a single test.
e testType The testType element holds data about the type of test. Example values: string, datacat
e testDC The testDC element holds data about a data category to which a test pertains.
Example values: semType, tense
e testValue The testValue element holds data about the string or data category being tested in the context(s) (e.g. 'sg' if the test is on the data category for grammatical number). Example values: anim-hum, sg
e logOp The logOp element holds data about a logical operator. Possible values: AND - for trRestrictStmt and structChangeStmt; OR - for trRestrictStmt; NOT - for trRestrictStmt
e logOpAnd The logOpAnd element holds data about the logical operator AND.
a TrTarget The TrTarget attribute holds data about the target entry of a transfer relationship.
a TrDefault The TrDefault attribute holds data about the default transfer.
e workflowInfo The workflowInfo element holds data about workflow-related information like the task that is currently performed, its deadlines, and the person responsible for executing the task.
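The test, testDC, testValue and logOp descriptions suggest how a group of tests might be checked against the data categories of a context. The following sketch illustrates that idea only. The evaluation semantics are assumptions and are not stated in the descriptions above: a test is taken to hold when the named data category equals the test value, and NOT is read as "none of the tests hold". The names DataCatTest, run_test and evaluate are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCatTest:
    """One test: a data category name (testDC) and the value it is tested for (testValue)."""
    test_dc: str
    test_value: str


def run_test(test: DataCatTest, categories: dict[str, str]) -> bool:
    # Assumed semantics: the test holds if the category is present with exactly that value.
    return categories.get(test.test_dc) == test.test_value


def evaluate(tests: list[DataCatTest], log_op: str, categories: dict[str, str]) -> bool:
    """Combine test results with one logical operator (AND, OR or NOT)."""
    results = [run_test(test, categories) for test in tests]
    if log_op == "AND":
        return all(results)
    if log_op == "OR":
        return any(results)
    if log_op == "NOT":
        # Assumption: NOT means that none of the grouped tests hold.
        return not any(results)
    raise ValueError(f"unknown logical operator: {log_op}")


if __name__ == "__main__":
    context = {"semType": "anim-hum", "number": "sg"}
    tests = [DataCatTest("semType", "anim-hum"), DataCatTest("number", "pl")]
    print(evaluate(tests, "AND", context), evaluate(tests, "OR", context))
```

Keeping the operator as a plain string mirrors the AND/OR/NOT values listed for logOp; a tool that consumes real OLIF data would need to follow the actual schema and guidelines for nesting and precedence.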