Ingestion of Voice Cases
Voice Data Import Guidelines:
Adhere to the following specifications for a successful import of voice data into Sprinklr:
File Directory:
Maintain a single, consistent directory that the Sprinklr importer configuration searches for files.
Ensure a one-to-one mapping between a specific importer configuration and a directory. No two importer configurations can point to the same directory.
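The one-to-one rule above can be checked before any configuration is set up. The following is a minimal sketch, not Sprinklr tooling: the function name `find_shared_directories`, the importer names, and the directory paths are all hypothetical, and it assumes POSIX-style SFTP paths.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_shared_directories(importers: dict[str, str]) -> dict[str, list[str]]:
    """Return every directory that more than one importer configuration points to.

    `importers` maps a (hypothetical) importer configuration name to its directory.
    """
    by_directory = defaultdict(list)
    for name, directory in importers.items():
        # Normalize so that "/voice/p1" and "/voice/p1/" count as the same directory.
        by_directory[str(PurePosixPath(directory))].append(name)
    return {d: names for d, names in by_directory.items() if len(names) > 1}


if __name__ == "__main__":
    conflicts = find_shared_directories({
        "partner_a_importer": "/voice/partner_a",
        "partner_b_importer": "/voice/partner_a/",  # clashes with partner_a_importer
        "partner_c_importer": "/voice/partner_c",
    })
    print(conflicts)
```

An empty result means every importer configuration has its own directory.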
File Extension and Content:
Sprinklr currently supports only zip or tar.gz files for voice data.
Each zip or tar.gz file should contain one metadata file and one call recording (audio file).
If the metadata file is a CSV, it should have exactly two rows: the first for headers and the second for values.
File Prefix:
Maintain a consistent prefix for zip files. The configuration searches for zip files with this prefix (for example, Partner_Call_1.zip, Partner_Call_2.zip).
If Sprinklr does not find a metadata file or a recording file with an expected extension inside the zip/tar file, the entire zip/tar file is ignored.
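Because an archive with unexpected contents is ignored as a whole, it can help to check each file before upload. The sketch below is illustrative only and covers zip files: the function name `check_voice_zip` is hypothetical, and the audio extensions `.wav`, `.mp3`, and `.opus` are an assumption derived from the supported audio formats, not a confirmed list of extensions Sprinklr looks for.

```python
import zipfile
from pathlib import Path, PurePosixPath

METADATA_EXTS = {".json", ".xml", ".csv"}
# Assumption: supported audio formats map to these file extensions.
AUDIO_EXTS = {".wav", ".mp3", ".opus"}


def check_voice_zip(path: str, prefix: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if not Path(path).name.startswith(prefix):
        problems.append(f"file name does not start with prefix {prefix!r}")

    with zipfile.ZipFile(path) as archive:
        # Skip directory entries, which end with "/".
        members = [n for n in archive.namelist() if not n.endswith("/")]

    extensions = [PurePosixPath(n).suffix.lower() for n in members]
    metadata_count = sum(ext in METADATA_EXTS for ext in extensions)
    audio_count = sum(ext in AUDIO_EXTS for ext in extensions)

    if metadata_count != 1:
        problems.append(f"expected 1 metadata file, found {metadata_count}")
    if audio_count != 1:
        problems.append(f"expected 1 audio recording, found {audio_count}")
    return problems
```

For example, `check_voice_zip("Partner_Call_1.zip", "Partner_Call_")` returns an empty list when the archive holds one metadata file and one recording.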
Metadata File Extension:
Sprinklr currently supports three types of metadata file extensions: JSON, XML, and CSV. Partners must adhere to these extensions for all selected files.
Metadata Contents:
Every conversation must contain a unique conversation ID for system identification.
Ensure consistency in metadata contents across all cases for a single configuration.
In the absence of an agent ID in the config, the system assigns the case to the default bot user shared across partners.
The system does not currently support the import of customer profiles. If the customer ID is present, a new profile is created; otherwise, the case is associated with a backend-created profile consistent across partners.
Only one agent per case is currently supported.
If the call start time is not present in the metadata, the current time of upload is considered the call start time.
A CSV metadata file must have exactly two rows: the first for headers and the second for values.
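The CSV rules above (two rows, with a conversation ID present) can be verified with a short script. This is a sketch under stated assumptions: the function name `check_csv_metadata` and the column name `conversation_id` are hypothetical, since the actual header names depend on the importer configuration. Fully blank lines are ignored here, which is a design choice of this sketch.

```python
import csv
import io


def check_csv_metadata(text: str, id_column: str = "conversation_id") -> list[str]:
    """Return a list of problems found in the CSV metadata text."""
    # Drop fully blank lines; csv.reader yields [] for them.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) != 2:
        return [f"expected 2 rows (headers + values), found {len(rows)}"]

    problems = []
    header, values = rows
    if len(header) != len(values):
        problems.append("header row and value row have different column counts")

    record = dict(zip(header, values))
    # id_column is a hypothetical name; use the header your configuration maps.
    if not record.get(id_column, "").strip():
        problems.append(f"missing or empty {id_column!r}")
    return problems


if __name__ == "__main__":
    sample = "conversation_id,agent_id\nconv-001,agent-42\n"
    print(check_csv_metadata(sample))
```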
Audio Files:
Supported formats: WAV, MP3, and OPUS, in stereo, with a sampling rate of 8000 Hz or higher.
Audio Encodings (Choose one):
PCM signed 16-bit little-endian (pcm_s16le)
PCM A-law
PCM mu-law
G.729a
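For WAV recordings, the sampling rate and channel count can be inspected with Python's standard `wave` module. This is a minimal sketch with a hypothetical function name, `check_wav`. The `wave` module reads linear PCM only, so A-law, mu-law, and G.729a recordings, as well as MP3 and OPUS files, are reported as unreadable here and would need a different tool.

```python
import wave


def check_wav(path: str, min_rate: int = 8000) -> list[str]:
    """Return a list of problems found in a linear-PCM WAV file."""
    try:
        with wave.open(path, "rb") as recording:
            rate = recording.getframerate()
            channels = recording.getnchannels()
    except (wave.Error, EOFError) as exc:
        # Raised for non-PCM encodings, truncated files, or non-WAV data.
        return [f"not readable as linear-PCM WAV: {exc}"]

    problems = []
    if rate < min_rate:
        problems.append(f"sampling rate {rate} Hz is below {min_rate} Hz")
    if channels != 2:
        problems.append(f"expected stereo (2 channels), found {channels}")
    return problems
```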
Optimizing for ASR Models:
Ensure audio files are representative and mimic actual production data and use cases for optimal ASR (Automatic Speech Recognition) model performance.
Include recordings from different speakers, with a balanced distribution of male and female voices.
Include sample calls/audios for each intent/use case.
Ensure availability of different speakers from various locations or with different accents.
Processing Time:
On average, it takes around five seconds to create a case from a five-minute conversation between a customer and an agent imported via SFTP.
Sample Metadata File: