
NSLinguisticTaggerMBS class

| Type | Topic | Plugin | Version | macOS | Windows | Linux | iOS | Targets |
|---|---|---|---|---|---|---|---|---|
| class | Linguistic | MBS MacCocoa Plugin | 17.3 | ✅ Yes | ❌ No | ❌ No | ✅ Yes | All |
Analyze natural language to tag part of speech and lexical class, identify proper names, perform lemmatization, and determine the language and script (orthography) of text.
Example
// Ask the tagger for language identification only
dim TagScheme as string = NSLinguisticTaggerMBS.NSLinguisticTagSchemeLanguage
dim TagSchemes() as string = array(TagScheme)
dim t as new NSLinguisticTaggerMBS(TagSchemes)

// The text to analyze
t.Text = "Hallo Leute"

// Both ranges are filled in by tagAtIndex
dim tokenRange as NSRangeMBS
dim sentenceRange as NSRangeMBS
dim tag as string = t.tagAtIndex(0, TagScheme, tokenRange, sentenceRange)

MsgBox "Language: "+tag // should be "de" for German

The NSLinguisticTaggerMBS class provides a uniform interface to a variety of natural language processing functionality with support for many different languages and scripts. You can use NSLinguisticTaggerMBS to segment natural language text into paragraphs, sentences, or words, and tag information about those tokens, such as part of speech, lexical class, lemma, script, and language.
When you create a linguistic tagger, you specify what kind of information you're interested in by passing one or more NSLinguisticTagScheme values. Set the Text property to the natural language text you want to analyze, and the linguistic tagger processes it according to the specified tag schemes. You can then enumerate over the tags in a specified range to get the information requested for a given scheme and unit.
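
As a rough sketch of walking through the tokens of a text, the loop below calls tagAtIndex repeatedly and uses the returned token range to move to the next token. It relies only on the tagAtIndex call shown in the example above, plus two assumptions that are not confirmed on this page: that a shared NSLinguisticTagSchemeLexicalClass value exists alongside NSLinguisticTagSchemeLanguage, and that NSRangeMBS exposes Location and Length properties.

```xojo
// Assumption: NSLinguisticTagSchemeLexicalClass is available on the class,
// analogous to NSLinguisticTagSchemeLanguage used in the example above.
dim scheme as string = NSLinguisticTaggerMBS.NSLinguisticTagSchemeLexicalClass
dim t as new NSLinguisticTaggerMBS(array(scheme))

t.Text = "The quick brown fox jumps."

dim index as integer = 0
dim textLength as integer = len(t.Text)

while index < textLength
  dim tokenRange as NSRangeMBS
  dim sentenceRange as NSRangeMBS
  dim tag as string = t.tagAtIndex(index, scheme, tokenRange, sentenceRange)

  // Stop if no usable range came back, so the loop cannot run forever.
  // Assumption: NSRangeMBS has Location and Length properties.
  if tokenRange = nil or tokenRange.Length = 0 then exit

  // Mid is 1-based and counts characters; ranges count UTF-16 units,
  // which is the same thing for plain text like this sample.
  dim token as string = mid(t.Text, tokenRange.Location + 1, tokenRange.Length)
  System.DebugLog token + " -> " + tag

  // Continue with the first character after the current token.
  index = tokenRange.Location + tokenRange.Length
wend
```

Whitespace and punctuation are reported as tokens of their own in this loop, because tagAtIndex takes no options. The option constants listed under Options apply to methods that accept an options parameter.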

Options

| Constant | Value | Description |
|---|---|---|
| NSLinguisticTaggerJoinNames | 16 | Typically, multiple-word names will be returned as multiple tokens, following the standard tokenization practice of the tagger. If this option is set, then multiple-word names will be joined together and returned as a single token. |
| NSLinguisticTaggerOmitOther | 8 | Omit tokens of type NSLinguisticTagOther (non-linguistic items, such as symbols). |
| NSLinguisticTaggerOmitPunctuation | 2 | Omit tokens of type NSLinguisticTagPunctuation (all punctuation). |
| NSLinguisticTaggerOmitWhitespace | 4 | Omit tokens of type NSLinguisticTagWhitespace (whitespace of all sorts). |
| NSLinguisticTaggerOmitWords | 1 | Omit tokens of type NSLinguisticTagWord (items considered to be words). |
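
The option values are distinct powers of two, so several options can be combined into one integer. The sketch below assumes the constants are reachable through the class name, as this page suggests; which methods accept the combined value is not shown here.

```xojo
// Skip punctuation and whitespace, and keep multi-word names together.
// Assumption: the constants are accessed via the class name.
dim options as integer = NSLinguisticTaggerMBS.NSLinguisticTaggerOmitPunctuation _
  or NSLinguisticTaggerMBS.NSLinguisticTaggerOmitWhitespace _
  or NSLinguisticTaggerMBS.NSLinguisticTaggerJoinNames

// With the values from the table: 2 + 4 + 16 = 22

// Check whether a particular option is part of the combination.
if (options and NSLinguisticTaggerMBS.NSLinguisticTaggerOmitWhitespace) <> 0 then
  System.DebugLog "Whitespace tokens will be omitted."
end if
```

On integers, Xojo's Or and And operators work bitwise, so no helper function is needed to build or test the mask.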

Units

| Constant | Value | Description |
|---|---|---|
| NSLinguisticTaggerUnitDocument | 3 | The document in its entirety. |
| NSLinguisticTaggerUnitParagraph | 2 | An individual paragraph. |
| NSLinguisticTaggerUnitSentence | 1 | An individual sentence. |
| NSLinguisticTaggerUnitWord | 0 | An individual word. |

This class has no subclasses.



The items on this page are in the following plugins: MBS MacCocoa Plugin.

