File To Text#
The File to Text Processor converts the contents of a file into plain text. It lets applications extract textual data from various file formats so that the text can be processed or analyzed further.
Supported Input Port:
filepath: The File to Text Processor accepts input through the “filepath” port. The input should be a string representing the file path of the file to be converted to text.
Supported Output Port:
text: The processor produces output through the “text” port. The output is a string containing the extracted text from the input file.
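The port contract above (a file path string in, a text string out) can be illustrated with a minimal Python sketch. This is not the processor's actual implementation: the function names `file_to_text` and `run_processor` are hypothetical, and the sketch assumes the input is a UTF-8 plain-text file, whereas the real processor handles other formats.

```python
from pathlib import Path


def file_to_text(filepath: str) -> str:
    """Hypothetical stand-in for the processor: file path in, text out."""
    path = Path(filepath)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {filepath}")
    # Assumption: the file is UTF-8 plain text; undecodable bytes are replaced.
    return path.read_text(encoding="utf-8", errors="replace")


def run_processor(inputs: dict) -> dict:
    """Read from the 'filepath' input port and write to the 'text' output port."""
    return {"text": file_to_text(inputs["filepath"])}
```

Calling `run_processor({"filepath": "notes.txt"})` would return a dictionary with a single `text` key, mirroring the two ports described above.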
List of Implementations:#
Langchain Implementation#
The Langchain implementation of the File to Text Processor uses the Langchain library to convert files to text.
Metadata
Sample processor configuration:#
NOTE: A processor is always added to a module (Input or Output). The module is then added to the pipeline.
{
"processor_type": "file_to_text",
"processor_implementation_type": "file_to_text_with_s3",
"input_port": "filepath",
"output_port": "text",
"metadata": {}
}
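A configuration like this can be checked before it is added to a module. The sketch below is one possible way to do that in Python, not part of the product: `validate_config` and the key set it checks are assumptions for illustration, and the port names are taken from the port list earlier on this page. Note that strict JSON parsers, including Python's `json` module, reject a trailing comma after the last key.

```python
import json

# Port names taken from the port descriptions in this document.
SUPPORTED_INPUT_PORTS = {"filepath"}
SUPPORTED_OUTPUT_PORTS = {"text"}

# Assumption: every key shown in the sample configuration is required.
REQUIRED_KEYS = {
    "processor_type",
    "processor_implementation_type",
    "input_port",
    "output_port",
    "metadata",
}


def validate_config(raw: str) -> dict:
    """Parse a processor configuration and check its keys and port names."""
    config = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    if config["input_port"] not in SUPPORTED_INPUT_PORTS:
        raise ValueError(f"Unsupported input port: {config['input_port']}")
    if config["output_port"] not in SUPPORTED_OUTPUT_PORTS:
        raise ValueError(f"Unsupported output port: {config['output_port']}")
    return config
```

Running the check when a pipeline is assembled surfaces a mistyped port name or malformed JSON immediately, rather than when the first file is processed.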