Natural language processing
###########################

``BytePairEncoder``
-------------------

.. autoclass:: numpy_ml.preprocessing.nlp.BytePairEncoder
    :members:
    :undoc-members:
    :inherited-members:

``HuffmanEncoder``
------------------

.. autoclass:: numpy_ml.preprocessing.nlp.HuffmanEncoder
    :members:
    :undoc-members:
    :inherited-members:

``TFIDFEncoder``
----------------

.. autoclass:: numpy_ml.preprocessing.nlp.TFIDFEncoder
    :members:
    :undoc-members:
    :inherited-members:

``Vocabulary``
--------------

.. autoclass:: numpy_ml.preprocessing.nlp.Vocabulary
    :members:
    :undoc-members:
    :inherited-members:

``Token``
---------

.. autoclass:: numpy_ml.preprocessing.nlp.Token
    :members:
    :undoc-members:
    :inherited-members:

``ngrams``
----------

.. autofunction:: numpy_ml.preprocessing.nlp.ngrams

``remove_stop_words``
---------------------

.. autofunction:: numpy_ml.preprocessing.nlp.remove_stop_words

``strip_punctuation``
---------------------

.. autofunction:: numpy_ml.preprocessing.nlp.strip_punctuation

``tokenize_words``
------------------

.. autofunction:: numpy_ml.preprocessing.nlp.tokenize_words

``tokenize_whitespace``
-----------------------

.. autofunction:: numpy_ml.preprocessing.nlp.tokenize_whitespace

``tokenize_chars``
------------------

.. autofunction:: numpy_ml.preprocessing.nlp.tokenize_chars

``tokenize_bytes_raw``
----------------------

.. autofunction:: numpy_ml.preprocessing.nlp.tokenize_bytes_raw

``bytes_to_chars``
------------------

.. autofunction:: numpy_ml.preprocessing.nlp.bytes_to_chars