mindmeld.components.nlp module¶

This module contains the natural language processor.

class mindmeld.components.nlp.DomainProcessor(app_path, domain, resource_loader=None, progress_bar=None)[source]¶

Bases: mindmeld.components.nlp.Processor

The domain processor houses the hierarchy of domain-specific natural language processing models required for understanding the user input for a particular domain.

name¶: str -- The name of the domain.

intent_classifier¶: IntentClassifier -- The intent classifier for this domain.

inspect(query, intent=None, dynamic_resource=None)[source]¶

Inspects the query.

Parameters:	query (Query) -- The query to be predicted. intent (str) -- The expected intent label for this query. dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
Returns:	A 2D list with one row per feature, giving the feature's value, weight, and probability.
Return type:	(list of lists)

process(query_text, allowed_nlp_classes=None, locale=None, language=None, time_zone=None, timestamp=None, dynamic_resource=None, verbose=False)[source]¶

Processes the given input text using the hierarchy of natural language processing models trained for this domain.

Parameters:

query_text (str, or list/tuple) -- The raw user text input, or a list of the n-best query transcripts from ASR.
allowed_nlp_classes (dict, optional) -- A dictionary of the intent section of the NLP hierarchy that is selected for NLP analysis. An example: `{'close_door': {}}` where close_door is the intent. The intent belongs to the smart_home domain. If allowed_nlp_classes is None, the normal model prediction is used.
locale (str, optional) -- The locale representing the ISO 639-1 language code and ISO 3166 alpha-2 country code separated by an underscore character.
language (str, optional) -- Language as specified using an ISO 639-1/2 code.
time_zone (str, optional) -- The name of an IANA time zone, such as 'America/Los_Angeles' or 'Asia/Kolkata'. See the [tz database](https://www.iana.org/time-zones) for more information.
timestamp (long, optional) -- A unix time stamp for the request (in seconds).
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
verbose (bool, optional) -- If True, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the hierarchy of natural language processing models to the input text.
Return type:	(ProcessedQuery)
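
The `query_text` parameter accepts either a single string or a sequence of n-best ASR transcripts. The following is a minimal sketch of how a caller might normalize both shapes before passing them on. The helper `as_transcripts` is hypothetical and is not part of MindMeld.

```python
from typing import Sequence, Tuple, Union


def as_transcripts(query_text: Union[str, Sequence[str]]) -> Tuple[str, ...]:
    """Return the input as a tuple of transcripts (hypothetical helper)."""
    if isinstance(query_text, str):
        # A single raw user input becomes a one-element tuple.
        return (query_text,)
    transcripts = tuple(query_text)
    if not transcripts or not all(isinstance(t, str) for t in transcripts):
        raise ValueError("expected a non-empty sequence of strings")
    return transcripts


single = as_transcripts("close the door")
n_best = as_transcripts(["close the door", "clothes the door"])
```

Either result could then be passed as `query_text`, since the parameter is documented to accept a string, list, or tuple.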

process_query(query, allowed_nlp_classes=None, dynamic_resource=None, verbose=False)[source]¶

Processes the given query using the full hierarchy of natural language processing models trained for this application.

Parameters:

query (Query, or tuple) -- The user input query, or a list of the n-best transcript query objects.
allowed_nlp_classes (dict, optional) -- A dictionary of the intent section of the NLP hierarchy that is selected for NLP analysis. An example: `{'close_door': {}}` where close_door is the intent. The intent belongs to the smart_home domain. If allowed_nlp_classes is None, the normal model prediction is used.
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
verbose (bool, optional) -- If True, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the full hierarchy of natural language processing models to the input query.
Return type:	(ProcessedQuery)

unload()[source]¶

intents¶: The intents supported within this domain (dict).

class mindmeld.components.nlp.EntityProcessor(app_path, domain, intent, entity_type, resource_loader=None, progress_bar=None)[source]¶

Bases: mindmeld.components.nlp.Processor

The entity processor houses the hierarchy of entity-specific natural language processing models required for analyzing a specific entity type in the user input.

domain¶: str -- The domain this entity belongs to.

intent¶: str -- The intent this entity belongs to.

type¶: str -- The type of this entity.

name¶: str -- The type of this entity.

role_classifier¶: RoleClassifier -- The role classifier for this entity type.

process_entity(query, entities, entity_index, allowed_nlp_classes, verbose=False)[source]¶

Processes the given entity using the hierarchy of natural language processing models trained for this entity type.

Parameters:

query (Query) -- The query the entity originated from.
entities (list) -- All entities recognized in the query.
entity_index (int) -- The index of the entity to process.
allowed_nlp_classes (dict) -- A dictionary of the NLP hierarchy that is selected for NLP analysis. An example: {'smart_home': {'close_door': {}}} where smart_home is the domain and close_door is the intent.
verbose (bool) -- If set to True, returns confidence scores of classes.

Returns:

A tuple containing:

ProcessedQuery -- A processed query object that contains the prediction results from applying the hierarchy of natural language processing models to the input entity.
confidence_score -- The confidence scores returned by the classifier.

Return type:

(tuple)

process_query(query, allowed_nlp_classes=None, dynamic_resource=None, verbose=False)[source]¶: Not implemented

resolve_entity(entity, aligned_entity_spans=None)[source]¶

Resolves a single entity. If aligned_entity_spans is not None, the resolution leverages the entity spans from the n-best transcripts. Otherwise, the resolution uses only the text of the entity.

Parameters:	entity (QueryEntity) -- The entity to process. aligned_entity_spans (list[QueryEntity]) -- The list of aligned n-best entity spans to improve resolution.
Returns:	The entity populated with the resolved values.
Return type:	(Entity)

unload()[source]¶

ready¶

class mindmeld.components.nlp.IntentProcessor(app_path, domain, intent, resource_loader=None, progress_bar=None)[source]¶

Bases: mindmeld.components.nlp.Processor

The intent processor houses the hierarchy of intent-specific natural language processing models required for understanding the user input for a particular intent.

domain¶: str -- The domain this intent belongs to.

name¶: str -- The name of this intent.

entity_recognizer¶: EntityRecognizer -- The entity recognizer for this intent.

get_entity_processors(label_set=None)[source]¶

process(query_text, allowed_nlp_classes=None, locale=None, language=None, time_zone=None, timestamp=None, dynamic_resource=None, verbose=False)[source]¶

Processes the given input text using the hierarchy of natural language processing models trained for this intent.

Parameters:

query_text (str, list, tuple) -- The raw user text input, or a list of the n-best query transcripts from ASR.
locale (str, optional) -- The locale representing the ISO 639-1 language code and ISO 3166 alpha-2 country code separated by an underscore character.
language (str, optional) -- Language as specified using an ISO 639-1/2 code.
time_zone (str, optional) -- The name of an IANA time zone, such as 'America/Los_Angeles' or 'Asia/Kolkata'. See the [tz database](https://www.iana.org/time-zones) for more information.
timestamp (long, optional) -- A unix time stamp for the request (in seconds).
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
verbose (bool, optional) -- If True, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the hierarchy of natural language processing models to the input text.
Return type:	(ProcessedQuery)

process_query(query, allowed_nlp_classes=None, dynamic_resource=None, max_ngram_search=3, verbose=False)[source]¶

Processes the given query using the hierarchy of natural language processing models trained for this intent.

Parameters:

query (Query, tuple) -- The user input query, or a list of the n-best transcript query objects.
allowed_nlp_classes (dict, optional) -- A dictionary of the NLP hierarchy that is selected for NLP analysis. An example: `{'smart_home': {'close_door': {}}}` where smart_home is the domain and close_door is the intent.
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
max_ngram_search (int, optional) -- The maximum n-gram size to use when processing the query for search.
verbose (bool, optional) -- If `True`, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the hierarchy of natural language processing models to the input query.
Return type:	(ProcessedQuery)

unload()[source]¶

entities¶: The entity types associated with this intent (list).

nbest_transcripts_enabled¶: Whether or not to run processing on the n-best transcripts for this intent (bool).

class mindmeld.components.nlp.NaturalLanguageProcessor(app_path, resource_loader=None, config=None, progress_bar=None)[source]¶

Bases: mindmeld.components.nlp.Processor

The natural language processor is the MindMeld component responsible for understanding the user input using a hierarchy of natural language processing models.

domain_classifier¶: DomainClassifier -- The domain classifier for this application.

extract_nlp_masked_components_list(allow_nlp_components_list=None, deny_nlp_components_list=None)[source]¶

Validates a user-supplied list of allowed NLP components against the NLP hierarchy and, if validation passes, constructs a hierarchy dictionary of the form `{domain: {intent: {}}}`.

Parameters:	allow_nlp_components_list (list) -- A list of allowed NLP components in the format "domain.intent.entity.role". deny_nlp_components_list (list) -- A list of denied NLP components in the format "domain.intent.entity.role".
Returns:	A dictionary of NLP hierarchy.
Return type:	(dict)
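
To illustrate the shape of the returned dictionary, here is a minimal sketch that builds `{domain: {intent: {}}}` from dotted component strings. It is not MindMeld's implementation: the function name `build_allowed_classes` and the `hierarchy` argument are hypothetical, deny lists are not handled, and only entries naming at least a domain and an intent are accepted.

```python
def build_allowed_classes(allow_list, hierarchy):
    """Build {domain: {intent: {}}} from 'domain.intent[...]' strings.

    `hierarchy` maps each known domain to a collection of its intents.
    Both names are hypothetical; this is not MindMeld's implementation.
    """
    allowed = {}
    for component in allow_list:
        parts = component.split(".")
        if len(parts) < 2:
            raise ValueError(f"expected at least 'domain.intent': {component!r}")
        domain, intent = parts[0], parts[1]
        # Validate against the known hierarchy before accepting the entry.
        if intent not in hierarchy.get(domain, ()):
            raise ValueError(f"unknown component: {component!r}")
        allowed.setdefault(domain, {})[intent] = {}
    return allowed


hierarchy = {
    "smart_home": {"close_door", "open_door"},
    "weather": {"check_weather"},
}
allowed = build_allowed_classes(["smart_home.close_door"], hierarchy)
print(allowed)  # {'smart_home': {'close_door': {}}}
```

The resulting dictionary has the same shape as the `allowed_nlp_classes` examples used throughout this module.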

inspect(markup, domain=None, intent=None, dynamic_resource=None)[source]¶

Inspects the marked-up query and prints the table of features and weights.

Parameters:	markup (str) -- The marked up query string. domain (str) -- The gold value for domain classification. intent (str) -- The gold value for intent classification. dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.

static print_inspect_stats(stats)[source]¶: Prints formatted output matrix

process(query_text, allowed_nlp_classes=None, allowed_intents=None, allow_nlp=None, deny_nlp=None, locale=None, language=None, time_zone=None, timestamp=None, dynamic_resource=None, verbose=False)[source]¶

Processes the given query using the full hierarchy of natural language processing models trained for this application.

Parameters:

query_text (str, tuple) -- The raw user text input, or a list of the n-best query transcripts from ASR.
allowed_nlp_classes (dict, optional) -- A dictionary of the NLP hierarchy that is selected for NLP analysis. An example: `{'smart_home': {'close_door': {}}}` where smart_home is the domain and close_door is the intent.
allowed_intents (list, optional) -- A list of allowed intents to use for the NLP processing.
allow_nlp (list, optional) -- A list of allowed NLP components to use for the NLP processing.
deny_nlp (list, optional) -- A list of denied NLP components to exclude from the NLP processing.
locale (str, optional) -- The locale representing the ISO 639-1 language code and ISO 3166 alpha-2 country code separated by an underscore character.
language (str, optional) -- Language as specified using an ISO 639-1/2 code. This parameter is deprecated and ignored; language is an application-level parameter.
time_zone (str, optional) -- The name of an IANA time zone, such as 'America/Los_Angeles' or 'Asia/Kolkata'. See the [tz database](https://www.iana.org/time-zones) for more information.
timestamp (long, optional) -- A unix time stamp for the request (in seconds).
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
verbose (bool, optional) -- If True, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the full hierarchy of natural language processing models to the input query.
Return type:	(ProcessedQuery)
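
The call below sketches how `allowed_nlp_classes` and `verbose` might be passed to `process`. The wrapper `process_restricted` and the `_StubNLP` class are hypothetical. The stub exists only so the snippet runs without a trained application, and it returns a plain dict rather than a ProcessedQuery. With a real application, `nlp` would be a NaturalLanguageProcessor constructed with the application's `app_path`.

```python
def process_restricted(nlp, text):
    """Call nlp.process with the analysis restricted to one intent."""
    # Same shape as the documented example: {domain: {intent: {}}}.
    allowed = {"smart_home": {"close_door": {}}}
    return nlp.process(text, allowed_nlp_classes=allowed, verbose=True)


class _StubNLP:
    """Stand-in used only so the snippet runs without a trained application."""

    def process(self, query_text, allowed_nlp_classes=None, verbose=False, **kwargs):
        # Echo the arguments back instead of running any models.
        return {
            "text": query_text,
            "allowed": allowed_nlp_classes,
            "verbose": verbose,
        }


result = process_restricted(_StubNLP(), "close the front door")
```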

process_query(query, allowed_nlp_classes=None, dynamic_resource=None, verbose=False)[source]¶

Processes the given query using the full hierarchy of natural language processing models trained for this application.

Parameters:

query (Query, tuple) -- The user input query, or a list of the n-best transcript query objects.
allowed_nlp_classes (dict, optional) -- A dictionary of the NLP hierarchy that is selected for NLP analysis. An example: `{'smart_home': {'close_door': {}}}` where smart_home is the domain and close_door is the intent. If `allowed_nlp_classes` is `None`, the normal model prediction is used.
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
verbose (bool, optional) -- If True, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the full hierarchy of natural language processing models to the input query.
Return type:	(ProcessedQuery)

unload()[source]¶

domains¶: The domains supported by this application.

class mindmeld.components.nlp.Processor(app_path, resource_loader=None, config=None)[source]¶

Bases: abc.ABC

A generic base class for processing queries through the MindMeld NLP components.

resource_loader¶: ResourceLoader -- An object which can load resources for the processor.

dirty¶: bool -- Indicates whether the processor has unsaved changes to its models.

ready¶: bool -- Indicates whether the processor is ready to process messages.

build(incremental=False, label_set=None)[source]¶

Builds all the natural language processing models for this processor and its children.

Parameters:	incremental (bool, optional) -- When `True`, only build models whose training data or configuration has changed since the last build. Defaults to `False`. label_set (str, optional) -- The label set from which to train all classifiers.

create_query(query_text, locale=None, language=None, time_zone=None, timestamp=None)[source]¶

Creates a query with the given text.

Parameters:

query_text (str, list[str]) -- Text or list of texts to create a query object for.
locale (str, optional) -- The locale representing the ISO 639-1 language code and ISO 3166 alpha-2 country code separated by an underscore character.
language (str, optional) -- Language as specified using an ISO 639-1/2 code.
time_zone (str, optional) -- The name of an IANA time zone, such as 'America/Los_Angeles' or 'Asia/Kolkata'. See the [tz database](https://www.iana.org/time-zones) for more information.
timestamp (long, optional) -- A unix time stamp for the request (in seconds).

Returns:	A newly constructed query or tuple of queries.
Return type:	(Query)
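
The `locale` and `timestamp` arguments have specific shapes: a two-letter language code and a two-letter country code joined by an underscore, and a Unix time in seconds. The sketch below shows one way to prepare such values using only the standard library. The helpers `split_locale` and `unix_seconds` are hypothetical, and the locale check validates only the shape of the string, not whether the codes are actually assigned.

```python
import re
from datetime import datetime, timezone

# Two lowercase letters, an underscore, two uppercase letters (e.g. 'en_US').
_LOCALE_RE = re.compile(r"([a-z]{2})_([A-Z]{2})")


def split_locale(locale):
    """Split 'en_US' into ('en', 'US'); raise ValueError on other shapes."""
    match = _LOCALE_RE.fullmatch(locale)
    if match is None:
        raise ValueError(f"not a language_COUNTRY locale: {locale!r}")
    return match.group(1), match.group(2)


def unix_seconds(moment):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    return int(moment.timestamp())


language, country = split_locale("en_US")
timestamp = unix_seconds(datetime(2020, 1, 1, tzinfo=timezone.utc))
print(language, country, timestamp)  # en US 1577836800
```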

dump()[source]¶: Saves all the natural language processing models for this processor and its children to disk.

evaluate(print_stats=False, label_set=None)[source]¶

Evaluates all the natural language processing models for this processor and its children.

Parameters:	print_stats (bool) -- If True, prints the full stats table. Otherwise, prints just the accuracy. label_set (str, optional) -- The label set from which to evaluate all classifiers.

load(incremental_timestamp=None)[source]¶

Loads all the natural language processing models for this processor and its children from disk.

Parameters:	incremental_timestamp (str, optional) -- The incremental timestamp value.
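
The `build`, `evaluate`, and `dump` methods described above are typically used together. The following is a minimal sketch of one possible ordering, using only the documented signatures. The function `rebuild_and_save` and the `_RecordingProcessor` stub are hypothetical. The stub records calls so the snippet runs without trained models, and any object exposing the same three methods could be passed instead.

```python
def rebuild_and_save(processor, label_set=None):
    """Build incrementally, evaluate, then persist the models."""
    # With incremental=True, only models whose training data or
    # configuration changed since the last build are rebuilt.
    processor.build(incremental=True, label_set=label_set)
    # print_stats=False prints just the accuracy, not the full table.
    processor.evaluate(print_stats=False, label_set=label_set)
    # Save the freshly built models to disk.
    processor.dump()


class _RecordingProcessor:
    """Stand-in that records calls instead of training anything."""

    def __init__(self):
        self.calls = []

    def build(self, incremental=False, label_set=None):
        self.calls.append(("build", incremental))

    def evaluate(self, print_stats=False, label_set=None):
        self.calls.append(("evaluate", print_stats))

    def dump(self):
        self.calls.append(("dump",))


recorder = _RecordingProcessor()
rebuild_and_save(recorder)
```

A later session could then call `load()` to restore the saved models instead of rebuilding them.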

process(query_text, allowed_nlp_classes=None, locale=None, language=None, time_zone=None, timestamp=None, dynamic_resource=None, verbose=False)[source]¶

Processes the given query using the full hierarchy of natural language processing models trained for this application.

Parameters:

query_text (str, tuple) -- The raw user text input, or a list of the n-best query transcripts from ASR.
allowed_nlp_classes (dict, optional) -- A dictionary of the NLP hierarchy that is selected for NLP analysis. An example: `{'smart_home': {'close_door': {}}}` where smart_home is the domain and close_door is the intent.
locale (str, optional) -- The locale representing the ISO 639-1 language code and ISO 3166 alpha-2 country code separated by an underscore character.
language (str, optional) -- Language as specified using an ISO 639-1/2 code. This parameter is deprecated; language is an application-level parameter.
time_zone (str, optional) -- The name of an IANA time zone, such as 'America/Los_Angeles' or 'Asia/Kolkata'. See the [tz database](https://www.iana.org/time-zones) for more information.
timestamp (long, optional) -- A unix time stamp for the request (in seconds).
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
verbose (bool, optional) -- If True, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the full hierarchy of natural language processing models to the input query.
Return type:	(ProcessedQuery)

process_query(query, allowed_nlp_classes=None, dynamic_resource=None, verbose=False)[source]¶

Processes the given query using the full hierarchy of natural language processing models trained for this application.

Parameters:

query (Query, tuple) -- The user input query, or a list of the n-best transcript query objects.
allowed_nlp_classes (dict, optional) -- A dictionary of the NLP hierarchy that is selected for NLP analysis. An example: `{'smart_home': {'close_door': {}}}` where smart_home is the domain and close_door is the intent.
dynamic_resource (dict, optional) -- A dynamic resource to aid NLP inference.
verbose (bool, optional) -- If True, returns class probabilities along with class prediction.

Returns:	A processed query object that contains the prediction results from applying the full hierarchy of natural language processing models to the input query.
Return type:	(ProcessedQuery)

unload()[source]¶

incremental_timestamp¶: The incremental timestamp of this processor (str).

instance_map = <WeakValueDictionary>¶: The map of identity to instance.

mindmeld.components.nlp.restart_subprocesses()[source]¶: Restarts the process pool executor

mindmeld.components.nlp.subproc_call_instance_function(instance_id, func_name, *args, **kwargs)[source]¶

A module function used as a trampoline to call an instance function from within a long running child process.

Parameters:	instance_id (int) -- The id(inst) of the Processor instance whose function needs to be called.
Returns:	The result of the called function
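
The trampoline pattern described here pairs naturally with an identity-to-instance map such as `instance_map`. The sketch below shows the general idea with the standard library only. It is not MindMeld's implementation: the `Worker` class and `call_instance_function` are hypothetical stand-ins, and everything runs in a single process. Across processes, the pattern assumes the child already has the same instance registered, for example because it was forked after the instance was created.

```python
import weakref


class Worker:
    """Hypothetical class standing in for a Processor."""

    # Maps id(instance) -> instance without keeping the instance alive.
    instance_map = weakref.WeakValueDictionary()

    def __init__(self, name):
        self.name = name
        Worker.instance_map[id(self)] = self

    def greet(self, greeting, punctuation="!"):
        return f"{greeting}, {self.name}{punctuation}"


def call_instance_function(instance_id, func_name, *args, **kwargs):
    """Module-level trampoline: look up the instance, then call the method."""
    instance = Worker.instance_map[instance_id]
    return getattr(instance, func_name)(*args, **kwargs)


worker = Worker("nlp")
message = call_instance_function(id(worker), "greet", "hello", punctuation="?")
print(message)  # hello, nlp?
```

Because the trampoline is a plain module-level function that takes only an integer id, a method name, and the call arguments, a process pool can submit it without pickling the instance itself.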