In this post, I'll share a detailed guide on how to set up an automated document workflow using Tape, macOS Folder Actions, a self-hosted PDF-to-text service, and OpenAI. The solution streamlines document handling, categorisation, and data extraction, making your workflow faster and smarter.
Video
Here is a video that shows all the steps in this example. You may notice my voice has changed: I am losing mine at the moment, so you can thank ElevenLabs for the softer tones.
Overview of the Workflow
The automation performs the following steps:
- A file is added to a Watch Folder on macOS.
- The file is uploaded to Tape using a Folder Action and AppleScript.
- Tape creates a new record and attaches the file.
- A self-hosted PDF service converts the document into plain text via an API call.
- Tape updates the record and sends the plain text to OpenAI for:
  - Key data extraction
  - Summary generation
  - Title creation
  - Categorisation
  - Direction assignment (incoming or outgoing)
- The OpenAI response updates the Tape record, adding a formatted HTML table with key information.
- All fields, including the searchable plain text version of the PDF, are finalised.
This process is completed within seconds, reducing manual effort significantly.
Step-by-Step Guide
1. Setting Up macOS Folder Actions
Folder Actions allow you to run a script whenever a file is added to a specified folder.
Here's a sample AppleScript you can use:
on adding folder items to this_folder after receiving added_items
    -- display dialog "Folder action triggered!"
    repeat with each_item in added_items
        -- Get the POSIX path of the file
        set file_path to POSIX path of each_item
        -- Webhook URL (replace with your own Tape webhook)
        set webhook_url to "https://your-tape-webhook-url"
        -- Construct the curl command; --fail makes curl exit with an error on
        -- HTTP 4xx/5xx so the file is not deleted after a rejected upload
        set curl_command to "/usr/bin/curl -v --fail -X POST -F file=@" & quoted form of file_path & " -F \"name=test\" " & quoted form of webhook_url
        try
            -- Execute the curl command
            set response to do shell script curl_command
            -- Log the server response
            -- display dialog "Response: " & response
            -- Move the uploaded file to the Trash
            tell application "Finder" to delete each_item
        on error error_message
            -- Log errors
            display dialog "Error: " & error_message
        end try
    end repeat
end adding folder items to
2. Configuring Tape Automations
Tape automations handle the workflow once the file is uploaded.
Automation 1: Record Creation
- Trigger: File is uploaded via the webhook.
- Action: Create a record and attach the file.
Automation 2: PDF-to-Text Conversion
- Trigger: Record is created with a PDF file.
- Action: Make an API call to your PDF service and store the plain text in a hidden field.
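As a rough sketch of Automation 2, the helpers below build the request for the PDF service and tidy the returned text before it is stored. Everything specific here is an assumption for illustration: the `/extract` path, the `{ url }` request body, the `text` response property, and the names `buildPdfRequest` and `cleanPlainText` are hypothetical, so adjust them to whatever your self-hosted service actually expects.

```javascript
// Hypothetical request shape: the "/extract" path and the { url } body are
// assumptions, not part of any specific PDF service.
function buildPdfRequest(serviceUrl, fileUrl) {
  return {
    url: `${serviceUrl.replace(/\/+$/, "")}/extract`,
    options: {
      headers: { "Content-Type": "application/json" },
      data: { url: fileUrl },
    },
  };
}

// Normalise the extracted text before storing it in the hidden field.
function cleanPlainText(text) {
  return String(text ?? "")
    .replace(/\r\n?/g, "\n") // unify line endings
    .replace(/[ \t]+\n/g, "\n") // drop trailing whitespace on each line
    .replace(/\n{3,}/g, "\n\n") // collapse runs of blank lines
    .trim();
}

// Possible use inside a Tape automation (assumes the service returns { text }):
// const req = buildPdfRequest(var_pdf_service_url, var_file_url);
// const { data } = await http.post(req.url, req.options);
// const plainText = cleanPlainText(data.text);
```

Keeping the request building and text clean-up as plain functions makes them easy to check outside Tape before wiring them into the automation.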
3. Using OpenAI for Advanced Processing
API Call to OpenAI
Tape sends the plain text to OpenAI with specific instructions in the system prompt. Here's an example:
const { data: openapi_response } = await http.post('https://api.openai.com/v1/chat/completions', {
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${var_openai_key}`,
  },
  data: {
    model: var_ai_model,
    messages: [
      { role: "system", content: var_system },
      { role: "user", content: var_user },
    ],
    temperature: 0.7,
  },
});
The response includes:
- A summary of the document.
- A title for the record.
- The document type (category).
- Key-value pairs for additional data.
- The direction of the document (incoming or outgoing).
Once the message content has been parsed into an object (content here), we can extract that information with:
const { summary, title, docType, additional, direction } = content;
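The destructuring above needs `content` to be an object, while the Chat Completions API returns the model's reply as a string in `choices[0].message.content`. A minimal sketch of that parsing step follows. It assumes the system prompt asks for a single JSON object with these five keys; the helper name `parseExtraction` is hypothetical.

```javascript
// Assumption: the system prompt asks the model to reply with one JSON object
// containing exactly these keys.
const REQUIRED_KEYS = ["summary", "title", "docType", "additional", "direction"];

function parseExtraction(openaiResponse) {
  const raw = openaiResponse?.choices?.[0]?.message?.content;
  if (typeof raw !== "string") {
    throw new Error("OpenAI response has no message content");
  }
  // Models sometimes wrap JSON in a Markdown code fence; strip it if present.
  const json = raw
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}$/, "");
  const content = JSON.parse(json);
  // Fail early with a readable message if the model left a key out.
  const missing = REQUIRED_KEYS.filter((key) => !(key in content));
  if (missing.length > 0) {
    throw new Error(`Missing keys in OpenAI response: ${missing.join(", ")}`);
  }
  return content;
}

// Possible use with the response from the call above:
// const content = parseExtraction(openapi_response);
```

Throwing on a missing key means a malformed reply stops the automation at this step instead of writing empty values into the record.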
4. Updating the Tape Record
The final part updates the Tape record:
- Populates fields with OpenAI's response.
- Adds a formatted HTML table for key information.
- Stores a searchable plain text version of the document.
Here's an example of the HTML table:
let htmlTable = `
  <h2 style="font-size: 1.45em; font-family: Quicksand; color: #1B98A6;">Key Information</h2>
  <table style="${styles.table}">
    <tr>
      <th style="${styles.th}">Key</th>
      <th style="${styles.th}">Value</th>
    </tr>`;
// Dynamically build table rows
Object.entries(additional).forEach(([key, value]) => {
  htmlTable += `<tr>
      <td style="${styles.td}">${key}</td>
      <td style="${styles.td}">${value}</td>
    </tr>`;
});
htmlTable += '</table>'; // Close the table
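The table code relies on a `styles` object that isn't shown, and it writes the extracted keys and values straight into HTML. Below is a sketch of both pieces under stated assumptions: the style values are placeholders of my own, and `escapeHtml` and `buildRows` are hypothetical helper names. Escaping is worth having because the values come from document text and could contain characters such as `<` or `&`.

```javascript
// Placeholder inline styles; swap in your own values.
const styles = {
  table: "border-collapse: collapse; width: 100%;",
  th: "text-align: left; padding: 6px; border-bottom: 2px solid #1B98A6;",
  td: "padding: 6px; border-bottom: 1px solid #ddd;",
};

// Escape the characters that have special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Build one table row per key-value pair in `additional`.
function buildRows(additional) {
  return Object.entries(additional)
    .map(([key, value]) => {
      // Nested objects would otherwise render as "[object Object]";
      // arrays are serialised the same way for consistency.
      const text = typeof value === "object" && value !== null ? JSON.stringify(value) : value;
      return `<tr><td style="${styles.td}">${escapeHtml(key)}</td><td style="${styles.td}">${escapeHtml(text)}</td></tr>`;
    })
    .join("");
}
```

With these in place, the row-building loop could be replaced by `htmlTable += buildRows(additional);`.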
Challenges and Solutions
Adding New Categories Dynamically
If OpenAI suggests a category that doesn't already exist, Tape creates it dynamically by updating the app's configuration. In my example, I don't pass the current categories to OpenAI with the extraction request, because I wanted to allow it "free rein" to start with. In the long term, however, I would recommend passing the current categories and asking the AI to use those where possible, only suggesting a new one if required.
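One way to follow that recommendation is to build the system prompt from the current category list. The sketch below is only an illustration: the prompt wording and the function name `buildSystemPrompt` are my own assumptions, not the prompt used in the video.

```javascript
// Hypothetical prompt wording; tune it to your own documents.
function buildSystemPrompt(categories = []) {
  const lines = [
    "You extract structured data from the plain text of a document.",
    "Reply with a single JSON object with the keys: summary, title, docType, additional, direction.",
    '"direction" must be "incoming" or "outgoing".',
    '"additional" is an object of key-value pairs with other useful details.',
  ];
  // Only mention existing categories when there are some to offer.
  if (categories.length > 0) {
    lines.push(
      `Existing document types: ${categories.join(", ")}.`,
      "Use one of the existing document types for docType where it fits, and only suggest a new one if none applies."
    );
  }
  return lines.join("\n");
}

// Possible use: pass the existing option labels from the Doc Type field.
// const var_system = buildSystemPrompt(["Invoice", "Contract"]);
```

Calling it with no arguments keeps the original "free rein" behaviour, so you can switch between the two approaches without changing the rest of the automation.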
I think I have posted before about automating category management, but here is some example code:
const appData = await tape.App.get(41954);
const typeList = jsonata(`data.fields[field_id=389515].config.settings.options[].text`).evaluate(appData);
if (!typeList?.includes(docType)) {
  await tape.App.update(41954, {
    fields: [{
      field_id: 389515,
      config: {
        label: "Doc Type",
        settings: { options: [{ text: docType }] },
      },
    }],
  });
  console.info(`Added new Doc Type: ${docType}`);
} else {
  console.info(`Doc Type already exists: ${docType}`);
}
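The `includes` check above is an exact string match, so "invoice" and "Invoice " would each count as new categories. A small, optional refinement is to compare names case-insensitively and ignore surrounding whitespace. The helper name `findExistingCategory` is hypothetical.

```javascript
// Return the existing option label that matches docType, ignoring case and
// surrounding whitespace, or null when there is no match.
function findExistingCategory(typeList, docType) {
  const wanted = String(docType).trim().toLowerCase();
  const match = (typeList ?? []).find(
    (name) => String(name).trim().toLowerCase() === wanted
  );
  return match ?? null;
}

// Possible use in place of the includes() check:
// const existing = findExistingCategory(typeList, docType);
// if (existing === null) { /* add the new option as shown above */ }
```

Returning the stored label rather than a boolean lets you write the existing spelling back to the record, which keeps the category field consistent.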
Debugging and Progress Tracking
Comments are added to the record at each step, helping users track automation progress.
Benefits of This Workflow
- Time-Saving: Automations eliminate the need for manual uploads, data extraction, and categorisation.
- Searchable Records: The plain text field ensures documents are fully searchable within Tape.
  a. This also means that when I send record details to Vector Storage it makes the full text available to my AI assistants.
- Scalability: Additional workflows or integrations can be added as needed.
Conclusion
This automated workflow combines the best of system integrations, APIs, and AI to deliver a powerful document-processing solution. It's highly customisable, so you can adapt it to suit your specific requirements.
If you have any questions or run into issues, feel free to ask here, and I'd be happy to help!