PDF Upload to Scopevisio Results in Corrupted Files (Base64 Encoding Issue)

Hans · December 18, 2024, 6:13pm

I want to automate the process of sending a PDF file, which is available in my Tape environment (via the link invoice_field_rechnungen_file_url from Filefield Rechnungen in Tape), to the Scopevisio service. The process involves the following steps:

Downloading the PDF file from Tape:

I have a link (invoice_field_rechnungen_file_url) pointing to a PDF file. This file should be retrieved by my script to obtain the raw content of the PDF.

Converting the downloaded PDF file to Base64:

Scopevisio expects the file to be sent as a Base64-encoded string, not as a URL or binary data. Therefore, I need to convert the downloaded PDF file into a Base64 string.

Sending the Base64-encoded content to Scopevisio:

Using an API request (HTTP POST), I want to send the Base64-encoded string along with the filename to Scopevisio. The service should then create a new incoming invoice and attach the provided PDF as the receipt.

Problem:

The file is successfully uploaded to Scopevisio, but it cannot be opened because it is corrupted. However, files uploaded via Postman are readable and correctly displayed in Scopevisio.

In summary:

I want to ensure that the PDF file is not simply forwarded but properly transformed before being sent to Scopevisio, so that Scopevisio can recognize, store, and display the content as a readable PDF. The challenge lies in correctly downloading the PDF as binary data, encoding it to Base64, and formatting the JSON payload properly for the Scopevisio API.

{
  // First, download the PDF from the URL (as binary!)
  const pdfResponse = await http.get(invoice_field_rechnungen_file_url, {
    // if possible, set an option to get binary data
    // responseType: 'arraybuffer', encoding: null, etc.
  });

  if (!pdfResponse || !pdfResponse.data) {
    throw new Error('Could not download the PDF.');
  }

  // Convert pdfResponse.data to a Buffer if it is not one already
  let bufferData;
  if (Buffer.isBuffer(pdfResponse.data)) {
    bufferData = pdfResponse.data;
  } else {
    bufferData = Buffer.from(pdfResponse.data, 'binary');
  }

  // Convert to Base64
  const base64Data = bufferData.toString('base64');

  // assemble headers
  const request_headers = {
    'Authorization': `Bearer 9cf8bbbccccccccb-e0c6e4-cccccccccccab85d5dd2bf`,
    'Content-Type': 'application/json'
  };

  // Now pass the Base64 data instead of the URL
  const request_body = JSON.stringify({
    filename: invoice_field_rechnungen_filename || "",
    data: base64Data
  });

  const request_options = { 
    followRedirects: false, 
    data: request_body, 
    headers: request_headers 
  };

  const httpResult = await http.post('https://appload.scopevisio.com/rest/incominginvoice/new', request_options);
  var_http_response = httpResult.data;
}

Here is the cURL command for Scopevisio:

curl -X 'POST' \
  'https://appload.scopevisio.com/rest/incominginvoice/new' \
  -H 'accept: */*' \
  -H 'Content-Type: application/json' \
  -d '{
  "filename": "string",
  "data": "string"
}'

Required profiles: Rechnungseingangsbuch (Bearbeiten).
The invoice should be provided as a base64 encoded pdf file. The maximum size allowed is 30 MB.
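
The 30 MB limit can be checked before the request is sent. This is a minimal sketch, assuming the limit applies to the decoded PDF and that MB means 1024 * 1024 bytes (the Scopevisio text above confirms neither). MAX_PDF_BYTES, decodedSize and assertWithinLimit are hypothetical names, not part of Tape or Scopevisio:

```javascript
// Assumption: "30 MB" refers to the decoded PDF and means 30 * 1024 * 1024 bytes.
const MAX_PDF_BYTES = 30 * 1024 * 1024;

// Size in bytes of the data that a padded Base64 string (no line breaks) decodes to.
// Every 4 Base64 characters carry 3 bytes; each trailing '=' stands for one missing byte.
function decodedSize(base64Data) {
  const padding = base64Data.endsWith('==') ? 2 : base64Data.endsWith('=') ? 1 : 0;
  return (base64Data.length / 4) * 3 - padding;
}

// Throws before the upload if the file would exceed the assumed limit.
function assertWithinLimit(base64Data) {
  const size = decodedSize(base64Data);
  if (size > MAX_PDF_BYTES) {
    throw new Error(`PDF is ${size} bytes, above the assumed limit of ${MAX_PDF_BYTES}`);
  }
  return size;
}

console.log(decodedSize(Buffer.from('%PDF-1.7').toString('base64')));
```

In the scripts in this thread, assertWithinLimit(base64Data) could be called right before the JSON body is built.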

Does anyone have an idea how I can solve this?

Greetings Hans

Hans · December 18, 2024, 7:37pm

This script works:

// Download the PDF file from the external URL
const url = invoice_field_rechnungen_file_url;
const { data: file_content } = await http.get(url, { responseEncoding: 'binary' });

// Convert the binary data to a Buffer and then to Base64
const bufferData = Buffer.from(file_content, 'binary');
const base64Data = bufferData.toString('base64');

// Send the Base64 content to Scopevisio
const filename = invoice_field_rechnungen_filename || 'unknown.pdf';
const request_headers = {
  'Authorization': `Bearer 9cf8bbbb-e0c6c9bd-6fba-4658-a1e4-aab85d5dd2bf`,
  'Content-Type': 'application/json'
};

const request_body = JSON.stringify({
  filename: filename,
  data: base64Data
});

const request_options = { 
  followRedirects: false, 
  data: request_body, 
  headers: request_headers 
};

const httpResult = await http.post('https://appload.scopevisio.com/rest/incominginvoice/new', request_options);
const var_http_response = httpResult.data;

// Log response
console.log('Response from Scopevisio:', JSON.stringify(var_http_response, null, 2));

The main difference between the two scripts lies in the handling and processing of the HTTP response data, as well as how the data is converted to Base64. Let’s take a closer look at the differences:

1. Downloading the PDF Data

In the first script:

const pdfResponse = await http.get(invoice_field_rechnungen_file_url, {
  // if possible, set an option to get binary data
  // responseType: 'arraybuffer', encoding: null, etc.
});

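The option that requests binary data is missing here; the relevant lines are only comments. Without responseEncoding: 'binary', http.get presumably decodes the response body as text (UTF-8 is the usual default), and every byte sequence that is not valid UTF-8 is replaced during decoding. The original bytes cannot be recovered afterwards, so the Base64 string that is sent later describes a damaged PDF.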
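
The effect can be reproduced without any HTTP call. This is a minimal sketch using only Node's Buffer, under the assumption that the client decodes the body as UTF-8 when no encoding is requested; the four sample bytes are arbitrary values above 0x7F, as they commonly occur in PDF files:

```javascript
// Four bytes above 0x7F, as they commonly appear in binary PDF content.
const original = Buffer.from([0xe2, 0xe3, 0xcf, 0xd3]);

// Lossy path: the bytes are decoded as UTF-8 text first (assumed client default).
// Invalid sequences are replaced, so the original values are gone.
const asUtf8Text = original.toString('utf8');
const lossy = Buffer.from(asUtf8Text, 'binary');

// Lossless path: 'binary' (an alias for 'latin1') maps each byte to exactly one character.
const asBinaryText = original.toString('binary');
const lossless = Buffer.from(asBinaryText, 'binary');

console.log('original :', original.toString('base64'));
console.log('lossy    :', lossy.toString('base64'), lossy.equals(original));
console.log('lossless :', lossless.toString('base64'), lossless.equals(original));
```

Only the lossless path produces the same Base64 string as the original bytes.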

In the second script:

const { data: file_content } = await http.get(url, { responseEncoding: 'binary' });

The option responseEncoding: 'binary' is explicitly specified, so the response body is returned as a binary string in which each character corresponds to exactly one byte of the file.

2. Converting the Data to a Buffer

In the first script:

let bufferData;
if (Buffer.isBuffer(pdfResponse.data)) {
  bufferData = pdfResponse.data;
} else {
  bufferData = Buffer.from(pdfResponse.data, 'binary');
}

Here, it checks whether the data is already a Buffer. If not, it creates a new Buffer. The check itself is harmless, but it cannot repair data that was already decoded as text during the download, so the fallback Buffer.from(..., 'binary') works on damaged input.

In the second script:

const bufferData = Buffer.from(file_content, 'binary');

The data is converted into a Buffer directly, without additional checks. This is safe here because the download already delivers a binary string.
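
If it is unclear which type the HTTP client returns, the conversion can be made defensive. This is a minimal sketch with a hypothetical helper toBuffer; which of these types Tape's http.get actually returns is an assumption, and a string is only handled correctly if it was downloaded as a binary string:

```javascript
// Hypothetical helper: normalizes the common binary representations to a Buffer.
function toBuffer(data) {
  if (Buffer.isBuffer(data)) {
    return data; // already a Buffer, nothing to do
  }
  if (data instanceof ArrayBuffer) {
    return Buffer.from(data); // wraps the ArrayBuffer without copying
  }
  if (ArrayBuffer.isView(data)) {
    // Typed arrays may cover only part of their underlying ArrayBuffer.
    return Buffer.from(data.buffer, data.byteOffset, data.byteLength);
  }
  if (typeof data === 'string') {
    // Assumption: the string is a binary (latin1) string, one character per byte.
    return Buffer.from(data, 'binary');
  }
  throw new TypeError('Unsupported response data type: ' + typeof data);
}

console.log(toBuffer('%PDF-').toString('base64'));
```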

3. Base64 Conversion

Both scripts use the same approach to convert the data to Base64, so this step is not the cause of the corruption:

const base64Data = bufferData.toString('base64');
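
Before uploading, the Base64 string can be given a quick sanity check. This is a minimal sketch with a hypothetical helper looksLikePdf; it only tests that the decoded data starts with the PDF signature "%PDF-", which catches a broken download early but does not prove that the whole file is intact:

```javascript
// Hypothetical helper: true if the Base64 string decodes to data starting with "%PDF-".
function looksLikePdf(base64Data) {
  // The first 8 Base64 characters decode to 6 bytes, enough for the 5-byte signature.
  const head = Buffer.from(base64Data.slice(0, 8), 'base64');
  return head.subarray(0, 5).toString('latin1') === '%PDF-';
}

const sample = Buffer.from('%PDF-1.7\n', 'latin1').toString('base64');
console.log(looksLikePdf(sample));
console.log(looksLikePdf(Buffer.from('<html>error page</html>').toString('base64')));
```

A failing check often means the URL returned something other than the PDF, for example an error page.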

4. Missing or Incorrect Options

The first script never sets the responseEncoding option. The candidate options appear only inside a comment, so the request runs with the client's defaults and the binary content is not preserved.

Conclusion

The main difference is that the first script downloads the PDF without responseEncoding: 'binary', so the bytes are altered before they are ever Base64-encoded and Scopevisio stores a corrupted file. The second script requests binary data explicitly and uses simpler logic for the conversion.