Transformer Classifier

TransformerClassifier is based on the transformers library. It is a wrapper around transformers.AutoModelForSequenceClassification; the language model should be one of the shortcuts for transformers pretrained models, or one of ['vinai/phobert-base', 'vinai/phobert-large'].

TransformerClassifier(num_labels=3, language_model_shortcut='vinai/phobert-base', device='cuda')
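
A fuller sketch with the training hyper-parameters spelled out (a minimal example using only the documented arguments; training data is supplied later through the pipeline):

from sentivi.classifier import TransformerClassifier

# Three-polarity classifier backed by PhoBERT; the pretrained language
# model weights are kept frozen, so only the classification head is trained.
clf = TransformerClassifier(num_labels=3,
                            language_model_shortcut='vinai/phobert-base',
                            freeze_language_model=True,
                            batch_size=4,
                            learning_rate=3e-5,
                            device='cuda')
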
class sentivi.classifier.TransformerClassifier(num_labels: Optional[int] = 3, language_model_shortcut: Optional[str] = 'vinai/phobert', freeze_language_model: Optional[bool] = True, batch_size: Optional[int] = 2, warmup_steps: Optional[int] = 100, weight_decay: Optional[float] = 0.01, accumulation_steps: Optional[int] = 50, save_steps: Optional[int] = 100, learning_rate: Optional[float] = 3e-05, device: Optional[str] = 'cpu', optimizer=None, criterion=None, num_epochs: Optional[int] = 10, num_workers: Optional[int] = 2, *args, **kwargs)
class TransformerDataset(batch_encodings, labels)
__init__(batch_encodings, labels)

Initialize transformer dataset

Parameters
  • batch_encodings – tokenizer output (input ids, attention masks, …) for a batch of sentences

  • labels – polarity labels aligned with batch_encodings

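A hedged sketch of building such a dataset, assuming batch_encodings is the output of a transformers tokenizer, labels is a parallel list of polarity ids, and the import path (sentivi.classifier.transformer) follows the package layout:

from transformers import AutoTokenizer

from sentivi.classifier.transformer import TransformerDataset  # assumed import path

tokenizer = AutoTokenizer.from_pretrained('vinai/phobert-base')
batch_encodings = tokenizer(['sản phẩm rất tốt', 'giao hàng quá chậm'],
                            padding=True, truncation=True)
dataset = TransformerDataset(batch_encodings, labels=[2, 0])  # example polarity ids
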
class TransformerPredictedDataset(batch_encodings)
__init__(batch_encodings)

Initialize transformer dataset for prediction

Parameters
  • batch_encodings – tokenizer output for a batch of unlabeled sentences
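
The prediction-time counterpart of the sketch above, reusing the same tokenizer and assumed import path (no labels, since the sentences are unseen):

from sentivi.classifier.transformer import TransformerPredictedDataset  # assumed import path

predict_dataset = TransformerPredictedDataset(
    tokenizer(['chất lượng tạm được'], padding=True, truncation=True))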

__init__(num_labels: Optional[int] = 3, language_model_shortcut: Optional[str] = 'vinai/phobert', freeze_language_model: Optional[bool] = True, batch_size: Optional[int] = 2, warmup_steps: Optional[int] = 100, weight_decay: Optional[float] = 0.01, accumulation_steps: Optional[int] = 50, save_steps: Optional[int] = 100, learning_rate: Optional[float] = 3e-05, device: Optional[str] = 'cpu', optimizer=None, criterion=None, num_epochs: Optional[int] = 10, num_workers: Optional[int] = 2, *args, **kwargs)

Initialize TransformerClassifier instance

Parameters
  • num_labels – number of polarities

  • language_model_shortcut – pretrained language model shortcut (e.g. 'vinai/phobert-base')

  • freeze_language_model – whether the language model weights are frozen during training

  • batch_size – training batch size

  • warmup_steps – number of learning rate warm-up steps

  • weight_decay – optimizer weight decay

  • accumulation_steps – number of gradient accumulation steps

  • save_steps – number of training steps between checkpoint saves

  • learning_rate – training learning rate

  • device – training and evaluating device ('cpu' or 'cuda')

  • optimizer – training optimizer

  • criterion – training criterion

  • num_epochs – maximum number of epochs

  • num_workers – number of DataLoader workers

  • args – arbitrary arguments

  • kwargs – arbitrary keyword arguments
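
Both optimizer and criterion default to None; a hedged sketch of overriding the criterion with a standard cross-entropy loss (whether an instance or a class is expected here is an assumption about the implementation):

import torch

from sentivi.classifier import TransformerClassifier

clf = TransformerClassifier(num_labels=3,
                            language_model_shortcut='vinai/phobert-base',
                            criterion=torch.nn.CrossEntropyLoss(),  # assumed to take an instance
                            num_epochs=10,
                            device='cpu')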

forward(data, *args, **kwargs)

Train and evaluate the TransformerClassifier instance

Parameters
  • data – TransformerTextEncoder output

  • args – arbitrary arguments

  • kwargs – arbitrary keyword arguments

Returns

training and evaluating results

Return type

str
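
In normal use forward is not called directly but through a sentivi Pipeline, which feeds the classifier the TransformerTextEncoder output. A hedged sketch assuming the Pipeline, DataLoader, TextEncoder, and TextProcessor quick-start API; the encode_type value, text-processor methods, and data file paths are all assumptions:

from sentivi import Pipeline
from sentivi.classifier import TransformerClassifier
from sentivi.data import DataLoader, TextEncoder
from sentivi.text_processor import TextProcessor

text_processor = TextProcessor(methods=['word_segmentation', 'remove_punctuation', 'lower'])

# The pipeline wires the encoder output into TransformerClassifier.forward.
pipeline = Pipeline(DataLoader(text_processor=text_processor, n_grams=1),
                    TextEncoder(encode_type='transformer'),
                    TransformerClassifier(num_labels=3,
                                          language_model_shortcut='vinai/phobert-base',
                                          device='cuda'))
results = pipeline(train='./train.txt', test='./test.txt')
print(results)  # training and evaluating report as a string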

get_overall_result(loader)

Get overall result

Parameters
  • loader – DataLoader

Returns

overall result

Return type

str
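
A hedged sketch, assuming loader is a standard torch.utils.data.DataLoader over a TransformerDataset such as the one built earlier:

from torch.utils.data import DataLoader as TorchDataLoader

loader = TorchDataLoader(dataset, batch_size=2)
report = clf.get_overall_result(loader)  # evaluation summary as a string
print(report)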

load(model_path, *args, **kwargs)

Load model from disk

Parameters
  • model_path – path to the saved model

  • args – arbitrary arguments

  • kwargs – arbitrary keyword arguments

Returns

predict(X, *args, **kwargs)

Predict polarities for a given list of sentences

Parameters
  • X – list of sentences

  • args – arbitrary arguments

  • kwargs – arbitrary keyword arguments

Returns

list of polarities

Return type

str
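
For instance, once the classifier has been trained (a sketch; the element type of the returned predictions is an implementation detail):

polarities = clf.predict(['sản phẩm rất tốt', 'giao hàng quá chậm'])
print(polarities)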

save(save_path, *args, **kwargs)

Save model to disk

Parameters
  • save_path – path at which to save the model

  • args – arbitrary arguments

  • kwargs – arbitrary keyword arguments

Returns
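
A sketch of a save/load round trip using the two methods above (the directory and file name are placeholders):

clf.save('./weights/transformer_clf.pth')

restored = TransformerClassifier(num_labels=3,
                                 language_model_shortcut='vinai/phobert-base')
restored.load('./weights/transformer_clf.pth')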