export const training_instructions_ner = `
In a ZIP archive, please include two folders, called train and val, and two JSON files, called label_id.json and model_config.json, using the formats below.



1) train & val folders format:

The train and val folders should each contain two TXT files called sentences.txt and tags.txt.

<< sentences.txt >>
The sentences.txt file contains the sentences, one per line, with tokens separated by spaces, e.g.,
This is an example PersonFirstName PersonLastName .
This is an example Location .
…

<< tags.txt >>
The tags.txt file contains the tag sequences in the BIO tagging format, one per line, in the same order as sentences.txt: each line holds the tags for the corresponding line of sentences.txt, e.g.,
O O O O B-PER I-PER O
O O O O B-LOC O
…

The number of tags on each line of tags.txt must exactly match the number of words (or tokens) on the corresponding line of sentences.txt. The tags should follow the BIO tagging format (short for Beginning, Inside, Outside), where:
- A B- prefix indicates that the token begins an entity, and an I- prefix indicates that the token is inside an entity.
- An O tag indicates that the token belongs to no entity.
- An I- tag is used only for a token that directly continues the entity of the preceding token, with no O tag between them. For example,
New York is a city => B-LOC I-LOC O O O
Paris is a city => B-LOC O O O



2) label_id.json format: a mapping from tag IDs to tag names, e.g. (for the PER, MISC, ORG and LOC tags),

{
    "0": "B-LOC",
    "1": "B-MISC",
    "2": "B-ORG",
    "3": "B-PER",
    "4": "I-LOC",
    "5": "I-MISC",
    "6": "I-ORG",
    "7": "I-PER",
    "8": "O"
}



3) model_config.json format: sets the model parameters using the accepted fields (see the documentation for details), e.g.,

{
    "LEARNING_RATE": 1e-3,
    "NUM_EPOCHS": 10,
    "NO_PRETRAINED_ENCODER": false,
    "LANGUAGE": "english"
}



Please contact support@slicex.ai if you have any questions.
`
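
/*
The tag-count and BIO rules in the instructions above could be checked before
the ZIP archive is built. What follows is only a minimal sketch, assuming the
contents of sentences.txt and tags.txt have already been read into strings.
The helper name validateNerPair and its return shape are hypothetical and are
not part of any SliceX API; the I- check reflects one reading of the BIO rule
(an I-X tag must directly follow B-X or I-X).

```javascript
// Hypothetical helper (not part of any SliceX API): returns a list of
// human-readable problems found in a sentences.txt / tags.txt pair.
function validateNerPair(sentencesText, tagsText) {
  // Drop one trailing newline, then split the text into lines.
  const toLines = (text) => text.replace(/\r?\n$/, "").split(/\r?\n/);
  const sentences = toLines(sentencesText);
  const tagLines = toLines(tagsText);
  const errors = [];

  if (sentences.length !== tagLines.length) {
    errors.push(
      "line count differs: " + sentences.length + " sentences vs " +
        tagLines.length + " tag lines"
    );
  }

  const n = Math.min(sentences.length, tagLines.length);
  for (let i = 0; i < n; i++) {
    const tokens = sentences[i].trim().split(/\s+/);
    const tags = tagLines[i].trim().split(/\s+/);

    // Each line needs exactly one tag per whitespace-separated token.
    if (tokens.length !== tags.length) {
      errors.push(
        "line " + (i + 1) + ": " + tokens.length + " tokens but " +
          tags.length + " tags"
      );
    }

    let previous = "O";
    for (const tag of tags) {
      if (!/^(O|[BI]-\S+)$/.test(tag)) {
        errors.push("line " + (i + 1) + ": malformed tag " + tag);
      } else if (tag.startsWith("I-")) {
        // Assumed reading of the BIO rule: I-X must directly follow B-X or I-X.
        const entity = tag.slice(2);
        if (previous !== "B-" + entity && previous !== "I-" + entity) {
          errors.push(
            "line " + (i + 1) + ": " + tag + " does not continue a " +
              entity + " entity"
          );
        }
      }
      previous = tag;
    }
  }
  return errors;
}
```

An empty array means no problems were found in the pair. The sketch does not
check the tags against label_id.json, which would be a natural extension.
*/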