What is the issue with the provided code related to the undefined 'steps'?

user-1 · April 5, 2024, 12:10pm

I'm running the code below and getting this error:

import { axios } from "@pipedream/platform";
import PDFParser from "pdf2json";

export default defineComponent({
  async run({ steps, $ }) {
    const pdfParser = new PDFParser();

    pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
    pdfParser.on("pdfParser_dataReady", pdfData => {
      const text = pdfData.formImage.Pages.reduce((acc, page) => {
        return acc + page.Texts.reduce((accText, textItem) => {
          // Decode URI component and replace any additional encoding specific to pdf2json
          const decodedText = decodeURIComponent(textItem.R[0].T).replace(/\\'/g, "'");
          return accText + decodedText + " ";
        }, "");
      }, "");
      
      console.log(text); // This will log the extracted text to the Pipedream logs
      return text; // This will return the extracted text as the step output
    });

    pdfParser.loadPDF("/tmp/data.pdf");
  },
});

user-1 · April 5, 2024, 12:10pm

It seems like you're trying to parse PDF data using the pdf2json library in a Pipedream component, but your message doesn't include the error itself. To better assist you, I'd need to know the exact error message you're receiving. In the meantime, I can offer some general advice and a revised version of your code that might help resolve common issues.

First, ensure that the PDF file you're trying to parse (/tmp/data.pdf) exists in the /tmp directory within your Pipedream environment. Remember, /tmp is the only writable directory in Pipedream's serverless environment.

Second, since you're using an external library (pdf2json), make sure it's compatible with Pipedream's Node.js environment. While Pipedream supports a wide range of npm packages, there might be specific limitations or compatibility issues with some packages.

Here's a revised version of your code with some additional error handling that might help identify the issue:

import { axios } from "@pipedream/platform";
import PDFParser from "pdf2json";

export default defineComponent({
  async run({ steps, $ }) {
    const pdfParser = new PDFParser(this, null);

    pdfParser.on("pdfParser_dataError", errData => console.error("PDF Parse Error:", errData.parserError));
    pdfParser.on("pdfParser_dataReady", pdfData => {
      const text = pdfData.formImage.Pages.reduce((acc, page) => {
        return acc + page.Texts.reduce((accText, textItem) => {
          // Decode URI component and replace any additional encoding specific to pdf2json
          const decodedText = decodeURIComponent(textItem.R[0].T).replace(/\\'/g, "'");
          return accText + decodedText + " ";
        }, "");
      }, "");

      console.log(text); // This will log the extracted text to the Pipedream logs
      return text; // This will return the extracted text as the step output
    });

    try {
      pdfParser.loadPDF("/tmp/data.pdf");
    } catch (error) {
      console.error("Error loading PDF:", error);
      throw new Error(`Failed to load PDF: ${error.message}`);
    }
  },
});

Please note the addition of a try...catch block around pdfParser.loadPDF to catch and log any errors that occur when attempting

user-1 · April 5, 2024, 12:10pm

I'm running the code below and getting this error:

ACTIVE_HANDLE
This step was still trying to run code when the step ended. Make sure you promisify callback functions and await all Promises. (Reason: FSReqPromise, Learn more: https://pipedream.com/docs/code/nodejs/async/)

import { axios } from "@pipedream/platform";
import PDFParser from "pdf2json";

export default defineComponent({
  async run({ steps, $ }) {
    const pdfParser = new PDFParser();

    pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));
    pdfParser.on("pdfParser_dataReady", pdfData => {
      const text = pdfData.formImage.Pages.reduce((acc, page) => {
        return acc + page.Texts.reduce((accText, textItem) => {
          // Decode URI component and replace any additional encoding specific to pdf2json
          const decodedText = decodeURIComponent(textItem.R[0].T).replace(/\\'/g, "'");
          return accText + decodedText + " ";
        }, "");
      }, "");
      
      console.log(text); // This will log the extracted text to the Pipedream logs
      return text; // This will return the extracted text as the step output
    });

    pdfParser.loadPDF("/tmp/data.pdf");
  },
});

user-1 · April 5, 2024, 12:10pm

The error you're encountering, "ACTIVE_HANDLE: This step was still trying to run code when the step ended. Make sure you promisify callback functions and await all Promises.", suggests that your code is using asynchronous operations that are not properly awaited or promisified. This is a common issue when dealing with event-based or callback-based asynchronous operations, such as those used by pdf2json.

In your specific case, the pdfParser operations are event-driven and not awaited within the run method, so the step completes before your asynchronous operations have finished. To resolve this, wrap your pdfParser operations in a Promise and await that Promise within the run method. This ensures that the step waits for the pdfParser operations to complete before moving on.

Here's how you can modify your code to use a Promise:

import { axios } from "@pipedream/platform";
import PDFParser from "pdf2json";

export default defineComponent({
  async run({ steps, $ }) {
    const pdfParser = new PDFParser();

    // Wrap the pdfParser operations in a Promise
    const textPromise = new Promise((resolve, reject) => {
      pdfParser.on("pdfParser_dataError", errData => reject(errData.parserError));
      pdfParser.on("pdfParser_dataReady", pdfData => {
        const text = pdfData.formImage.Pages.reduce((acc, page) => {
          return acc + page.Texts.reduce((accText, textItem) => {
            // Decode URI component and replace any additional encoding specific to pdf2json
            const decodedText = decodeURIComponent(textItem.R[0].T).replace(/\\'/g, "'");
            return accText + decodedText + " ";
          }, "");
        }, "");

        resolve(text); // Resolve the Promise with the extracted text
      });
    });

    pdfParser.loadPDF("/tmp/data.pdf");

    // Await the Promise to ensure the step waits for the pdfParser operations to complete
    const extractedText = await textPromise;
    console.log(extractedText); // This will log the extracted text to the Pipedream logs
    return extractedText; // This will return the extracted text as the step output
  },
});

By wrapping the pdfParser operations in a Promise and using resolve to return the extracted text when the pdfParser_dataReady event
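
The pattern described in this reply, turning a success event and an error event into one awaited Promise, can be sketched independently of pdf2json. In this minimal sketch, `waitForEvent` and `FakeParser` are hypothetical names introduced for illustration; `FakeParser` only mimics an emitter that fires the two event names used in this thread. It uses `require` so that it runs as a standalone Node.js script, not as a Pipedream step.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical helper: resolve on `successEvent`, reject on `errorEvent`.
function waitForEvent(emitter, successEvent, errorEvent) {
  return new Promise((resolve, reject) => {
    emitter.once(successEvent, resolve);
    emitter.once(errorEvent, reject);
  });
}

// Hypothetical stand-in for an event-based parser (not pdf2json itself).
class FakeParser extends EventEmitter {
  load(shouldFail) {
    // Emit asynchronously, as a real file read would.
    setImmediate(() => {
      if (shouldFail) this.emit("pdfParser_dataError", new Error("parse failed"));
      else this.emit("pdfParser_dataReady", { text: "hello" });
    });
  }
}

async function run() {
  const parser = new FakeParser();
  // Register the listeners first, then start the work, then await the result.
  const ready = waitForEvent(parser, "pdfParser_dataReady", "pdfParser_dataError");
  parser.load(false);
  const data = await ready;
  return data.text;
}
```

The important part is that `run` does not return until the awaited Promise settles, which is what keeps the step from ending while work is still pending.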

user-1 · April 5, 2024, 12:10pm

I'm running the code below and get this error:

Code Error
could not compute configurable props

This step was still trying to run code when the step ended. Make sure you promisify callback functions and await all Promises. (Reason: FSReqPromise, Learn more: https://pipedream.com/docs/code/nodejs/async/)

import { axios } from "@pipedream/platform";
import PDFParser from "pdf2json";

export default defineComponent({
  async run({ steps, $ }) {
    const pdfParser = new PDFParser();

    // Wrap the pdfParser operations in a Promise
    const textPromise = new Promise((resolve, reject) => {
      pdfParser.on("pdfParser_dataError", errData => reject(errData.parserError));
      pdfParser.on("pdfParser_dataReady", pdfData => {
        const text = pdfData.formImage.Pages.reduce((acc, page) => {
          return acc + page.Texts.reduce((accText, textItem) => {
            // Decode URI component and replace any additional encoding specific to pdf2json
            const decodedText = decodeURIComponent(textItem.R[0].T).replace(/\\'/g, "'");
            return accText + decodedText + " ";
          }, "");
        }, "");
        
        resolve(text); // Resolve the Promise with the extracted text
      });
    });

    pdfParser.loadPDF("/tmp/data.pdf");

    // Await the Promise to ensure the step waits for the pdfParser operations to complete
    const extractedText = await textPromise;
    console.log(extractedText); // This will log the extracted text to the Pipedream logs
    return extractedText; // This will return the extracted text as the step output
  },
});

user-1 · April 5, 2024, 12:10pm

Based on the error message you've received, it seems like your code is encountering issues with asynchronous operations, specifically with the pdfParser.loadPDF method not completing before the step ends. This is a common issue when dealing with asynchronous code in Node.js, as highlighted in the Pipedream documentation on asynchronous code in Node.js.

To address this issue, you should ensure that all asynchronous operations are properly awaited or promisified. In your case, the pdfParser.loadPDF method should also be included within the Promise to ensure it completes before proceeding.

Here's a revised version of your code that ensures pdfParser.loadPDF is called within the Promise, which should resolve the error:

import { axios } from "@pipedream/platform";
import PDFParser from "pdf2json";

export default defineComponent({
  async run({ steps, $ }) {
    const pdfParser = new PDFParser();

    // Wrap the pdfParser operations, including loadPDF, in a Promise
    const textPromise = new Promise((resolve, reject) => {
      pdfParser.on("pdfParser_dataError", errData => reject(errData.parserError));
      pdfParser.on("pdfParser_dataReady", pdfData => {
        const text = pdfData.formImage.Pages.reduce((acc, page) => {
          return acc + page.Texts.reduce((accText, textItem) => {
            // Decode URI component and replace any additional encoding specific to pdf2json
            const decodedText = decodeURIComponent(textItem.R[0].T).replace(/\\'/g, "'");
            return accText + decodedText + " ";
          }, "");
        }, "");

        resolve(text); // Resolve the Promise with the extracted text
      });

      // Move the loadPDF call inside the Promise to ensure it's part of the async flow
      pdfParser.loadPDF("/tmp/data.pdf");
    });

    // Await the Promise to ensure the step waits for the pdfParser operations to complete
    const extractedText = await textPromise;
    console.log(extractedText); // This will log the extracted text to the Pipedream logs
    return extractedText; // This will return the extracted text as the step output
  },
});

This modification ensures that the pdfParser.loadPDF method is included in the asynchronous operation, which should prevent the step from ending before the PDF
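
One detail worth keeping in mind with this change: the function passed to `new Promise` (the executor) runs synchronously, so a call placed inside it starts at the same moment it would if it were placed right after the constructor. What does change is that a synchronous throw inside the executor becomes a rejection of the Promise. A minimal sketch, where `startWork` and `wrapped` are hypothetical names and `startWork` merely stands in for a call like loadPDF:

```javascript
const order = [];

// Hypothetical stand-in for a call that may throw synchronously.
function startWork(shouldThrow) {
  order.push("startWork");
  if (shouldThrow) throw new Error("could not start");
}

function wrapped(shouldThrow) {
  order.push("before");
  const p = new Promise((resolve) => {
    // The executor runs immediately, before `new Promise` returns.
    startWork(shouldThrow);
    resolve("done");
  });
  order.push("after");
  return p;
}
```

Calling `wrapped(false)` records "before", "startWork", "after" in that order, while `wrapped(true)` returns a rejected Promise instead of throwing at the call site.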

user-1 · April 5, 2024, 12:10pm

I'm running the code below and get this message. What's wrong?

TypeError
Cannot read properties of undefined (reading 'Pages')



import { axios } from "@pipedream/platform";
import PDFParser from "pdf2json";

export default defineComponent({
  async run({ steps, $ }) {
    const pdfParser = new PDFParser();

    // Wrap the pdfParser operations, including loadPDF, in a Promise
    const textPromise = new Promise((resolve, reject) => {
      pdfParser.on("pdfParser_dataError", errData => reject(errData.parserError));
      pdfParser.on("pdfParser_dataReady", pdfData => {
        const text = pdfData.formImage.Pages.reduce((acc, page) => {
          return acc + page.Texts.reduce((accText, textItem) => {
            // Decode URI component and replace any additional encoding specific to pdf2json
            const decodedText = decodeURIComponent(textItem.R[0].T).replace(/\\'/g, "'");
            return accText + decodedText + " ";
          }, "");
        }, "");
        
        resolve(text); // Resolve the Promise with the extracted text
      });

      // Move the loadPDF call inside the Promise to ensure it's part of the async flow
      pdfParser.loadPDF("/tmp/data.pdf");
    });

    // Await the Promise to ensure the step waits for the pdfParser operations to complete
    const extractedText = await textPromise;
    console.log(extractedText); // This will log the extracted text to the Pipedream logs
    return extractedText; // This will return the extracted text as the step output
  },
});

user-1 · April 5, 2024, 12:10pm

The error message you're encountering, TypeError: Cannot read properties of undefined (reading 'Pages'), means that pdfData.formImage is undefined at the point where your code reads pdfData.formImage.Pages. In other words, the object passed to the pdfParser_dataReady handler is not structured the way your code expects. This could be due to several reasons, such as the PDF file not being loaded correctly, or the installed version of pdf2json structuring its output differently (for example, exposing Pages at the top level of pdfData rather than under formImage).

To troubleshoot this issue, you might want to:

1. Ensure that the PDF file /tmp/data.pdf exists and is accessible at the time pdfParser.loadPDF("/tmp/data.pdf"); is called.
2. Verify that the PDF file is not corrupt and is in a format that pdf2json can successfully parse.
3. Log the pdfData object (or just its top-level keys) before accessing pdfData.formImage.Pages to inspect its structure and confirm where the Pages property actually lives.

If you continue to face issues, consider reaching out for more support at Support - Pipedream.
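
The inspection step suggested in this reply can be made concrete with a small defensive helper. This is a sketch under an assumption: that, depending on the pdf2json version, the page list may sit either at pdfData.Pages or at pdfData.formImage.Pages. `getPages` and `pagesToText` are hypothetical names; only the Pages, Texts, R and T fields come from the code in this thread.

```javascript
// Hypothetical helper: find the page list wherever it is, or fail with a clear message.
function getPages(pdfData) {
  const pages = pdfData?.Pages ?? pdfData?.formImage?.Pages;
  if (!Array.isArray(pages)) {
    const keys = pdfData ? Object.keys(pdfData).join(", ") : "none";
    throw new Error(`No Pages array found; top-level keys: ${keys}`);
  }
  return pages;
}

// Text extraction along the lines of the code in this thread, built on the helper.
function pagesToText(pdfData) {
  return getPages(pdfData)
    .flatMap((page) => page.Texts)
    .map((textItem) => decodeURIComponent(textItem.R[0].T))
    .join(" ");
}
```

If neither location holds an array, the thrown error lists the keys that are present, which is usually enough to see what shape the parser actually returned.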

user-1 · April 5, 2024, 12:10pm

I'm running the code below, but it throws an error.

import { axios } from "@pipedream/platform";
import pdf-extract from "pdf-extract";

var inspect = require('eyes').inspector({maxLength:20000});
var pdf_extract = require('pdf-extract');
var absolute_path_to_pdf = '~/Downloads/electronic.pdf'
var options = {
  type: 'text' // extract the actual text in the pdf file
}
var processor = pdf_extract(absolute_path_to_pdf, options, function(err) {
  if (err) {
    return callback(err);
  }
});
processor.on('complete', function(data) {
  inspect(data.text_pages, 'extracted text pages');
  callback(null, data.text_pages);
});
processor.on('error', function(err) {
  inspect(err, 'error while extracting pages');
  return callback(err);
});

})

user-1 · April 5, 2024, 12:10pm

It looks like you're attempting to use both import and require syntax in the same Node.js step, which is not supported in Pipedream's environment. Additionally, the pdf-extract package might not work as expected within Pipedream due to its dependencies. Let's address these issues and provide a solution that fits within Pipedream's constraints.

First, let's correct the syntax issue. In Pipedream's environment, you should use either import or require, but not both in the same step. Since Pipedream supports ES Module syntax, let's stick with import. However, note that not all npm packages might be compatible with Pipedream's environment, especially those that rely on binaries or have complex dependencies, as pdf-extract might.

Given these constraints, if you're looking to extract text from a PDF, you might consider using a different approach or a cloud service API that can perform this task and is accessible via HTTP requests. Here's an example of how you might structure a request to a hypothetical API that extracts text from PDFs, using the axios package:

import { axios } from "@pipedream/platform";

export default defineComponent({
  async run({ steps, $ }) {
    try {
      const response = await axios($, {
        method: "POST",
        url: "https://api.example.com/pdf/extract-text",
        headers: {
          "Content-Type": "application/json",
          "Authorization": "Bearer YOUR_API_KEY",
        },
        data: {
          // Your request payload here, possibly including a URL to the PDF or the PDF content itself
        },
      });

      console.log("Extracted text:", response.data);
      return response.data;
    } catch (error) {
      console.error("Error extracting text from PDF:", error);
      throw error;
    }
  },
});

Please replace "https://api.example.com/pdf/extract-text" with the actual URL of the API you're using, and "YOUR_API_KEY" with your actual API key or token for that service. This example assumes there's an external service that can extract text from PDFs and is accessible via an HTTP API.

If you're set on using a specific npm package that doesn't work in Pipedream's environment, you might need to run that code elsewhere, such as your own server or a cloud function, and then call that service

user-1 · April 5, 2024, 12:10pm

I want to run the below code, please check it

import { PDFExtract } from "pdf.js-extract";
import fetch from "node-fetch";

export default defineComponent({
  async run({ steps, $ }) {
    // const url = "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf";
    const url = steps.trigger.event.body.text;
    const response = await fetch(url);
    const buffer = await response.buffer();

    const options = {};
    const pdfExtract = new PDFExtract();
    let data;

    try {
      data = await pdfExtract.extractBuffer(buffer, options);
      console.log(data);
    } catch (err) {
      console.log("Error extracting PDF data:", err);
    } finally {
      return data;
    }
  },
});

user-1 · April 5, 2024, 12:10pm

To run the code you've provided in Pipedream, we need to make a few adjustments for compatibility with the Pipedream environment. Specifically, we'll replace the fetch import with the axios package provided by @pipedream/platform for making HTTP requests, and ensure that the pdf.js-extract package is correctly used within the Pipedream environment.

Unfortunately, pdf.js-extract isn't directly supported in Pipedream's environment, and you cannot import it like in a standard Node.js environment. However, I'll guide you on how to achieve similar functionality using available tools and APIs.

Since we can't use pdf.js-extract, I'll show you how to fetch a PDF file from a URL and save it temporarily in the /tmp directory, which is a common step before processing files. For PDF data extraction, you might need to use an external API or service that supports PDF parsing and can be called via HTTP requests from Pipedream.

import { axios } from "@pipedream/platform";
import fs from "fs";

export default defineComponent({
  async run({ steps, $ }) {
    const url = steps.trigger.event.body.text;
    const response = await axios($, {
      method: "GET",
      url: url,
      responseType: "arraybuffer",
    });

    // Save the PDF temporarily in the /tmp directory
    const path = `/tmp/downloaded.pdf`;
    await fs.promises.writeFile(path, response, "binary");
    console.log(`PDF saved to ${path}`);

    // Here you would typically call an external API to process the PDF
    // Since pdf.js-extract is not available, consider using an API like PDF.co, Adobe PDF Services, etc.
    // Example (pseudo-code):
    // const pdfData = await callExternalPdfProcessingApi(path);
    // return pdfData;

    // For demonstration, we're just returning a message
    return { message: "PDF saved. Implement PDF processing with an external API." };
  },
});

This code snippet demonstrates how to fetch a PDF from a URL and save it to the /tmp directory.

For actual PDF data extraction, you'll need to integrate with an external service capable of processing PDFs, as direct usage of pdf.js-extract is not feasible within Pipedream's current environment. Remember, when working with external APIs for PDF processing, you
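
The download-and-save step in this reply comes down to writing raw bytes to disk. Here is a minimal standalone sketch of that part, assuming the response body has already been obtained as a Buffer. `saveToTmp` and `looksLikePdf` are hypothetical helpers, and `os.tmpdir()` stands in for /tmp so the sketch also runs outside Pipedream.

```javascript
const fs = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: write bytes to the temp directory and return the file path.
async function saveToTmp(bytes, filename) {
  const filePath = path.join(os.tmpdir(), filename);
  // No encoding argument is needed when the data is a Buffer.
  await fs.writeFile(filePath, bytes);
  return filePath;
}

// Cheap sanity check: PDF files begin with the bytes "%PDF-".
async function looksLikePdf(filePath) {
  const handle = await fs.open(filePath, "r");
  try {
    const { buffer, bytesRead } = await handle.read(Buffer.alloc(5), 0, 5, 0);
    return bytesRead === 5 && buffer.toString("latin1") === "%PDF-";
  } finally {
    await handle.close();
  }
}
```

Checking the first bytes before handing the file to a parser or an external API can catch the case where the URL returned an HTML error page instead of a PDF.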

user-1 · April 5, 2024, 12:10pm

I have the code below. I want the result to be usable in the next Pipedream step, not just printed.

import pdfplumber  # Install: pip install pdfplumber

def extract_text_from_pdf(pdf_path):
    """Extracts text from a PDF using pdfplumber.

    Args:
        pdf_path (str): Path to the PDF file.

    Returns:
        str: Extracted text from the PDF.
    """

    with pdfplumber.open(pdf_path) as pdf:
        pages = pdf.pages
        extracted_text = ""
        for page in pages:
            extracted_text += page.extract_text()
    return extracted_text

# Replace with the actual path to your PDF file
pdf_path = "/tmp/data.pdf"  # Adjust the path as needed
extracted_text = extract_text_from_pdf(pdf_path)

# Send the extracted text to Pipedream (modify this part according to your Pipedream integration setup)
# Example using a hypothetical Pipedream API endpoint:
# import requests
# requests.post("https://api.pipedream.com/v1/your-endpoint", data={"text": extracted_text})

# Or, for testing purposes, simply print the extracted text:
print(extracted_text)

user-1 · April 5, 2024, 12:10pm

To make the extracted text from the PDF usable in the next step of your Pipedream workflow, you'll need to modify your code to fit the Pipedream component model. Since you're working with Python code, here's how you can adapt your existing function to work within a Pipedream Python code step:

import pdfplumber  # Ensure pdfplumber is available in Pipedream's Python environment

def handler(pd: "pipedream"):
    pdf_path = "/tmp/data.pdf"  # Adjust the path as needed

    # Define the function to extract text from PDF
    def extract_text_from_pdf(pdf_path):
        """Extracts text from a PDF using pdfplumber.

        Args:
            pdf_path (str): Path to the PDF file.

        Returns:
            str: Extracted text from the PDF.
        """
        with pdfplumber.open(pdf_path) as pdf:
            pages = pdf.pages
            extracted_text = ""
            for page in pages:
                extracted_text += page.extract_text() if page.extract_text() else ""
        return extracted_text

    # Call the function and store the result
    extracted_text = extract_text_from_pdf(pdf_path)

    # Return the extracted text so it can be used in the next step
    return {"extracted_text": extracted_text}

This Python code is structured to fit within a Pipedream Python code step. It defines a handler function that Pipedream will execute, which includes your original extract_text_from_pdf function. The key difference is that instead of printing the extracted text, this code returns a dictionary with the extracted text under the key "extracted_text". This makes the extracted text available as an output of the step, which you can then use in subsequent steps of your workflow.

Remember, when using external libraries like pdfplumber in Pipedream, you should ensure they are available in Pipedream's Python environment. If you encounter issues with library availability, you might need to look for alternative methods or reach out to Pipedream support for assistance.

user-1 · April 5, 2024, 12:10pm

I want to use this code to get data from the previous step and pass it on to the next step in a structured form.

import re

# Access the extracted text from the previous step in Pipedream
extracted_text = steps.Extract_TXT_PDF.$return_value.extracted_text

# Extract date
date_match = re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", extracted_text)
date = date_match.group(0) if date_match else None

# Extract shipping information and name pairs
shipping_info_and_name_pairs = []
for match in re.finditer(r"(?<=Inlämning Kolli-ID/inl.nr: RR\d{12}SE)[\s\S]*?(?=Mottagare: )", extracted_text):
    shipping_info_and_name_pairs.append(match.group(0))

# Extract shipping information and name from each pair
shipping_information = []
names = []
for pair in shipping_info_and_name_pairs:
    shipping_info_match = re.search(r"Inlämning Kolli-ID/inl.nr: RR(\d{12})SE", pair)
    shipping_information.append(shipping_info_match.group(1) if shipping_info_match else None)
    name_match = re.search(r"Mottagare: (.*)", pair)
    names.append(name_match.group(1) if name_match else None)

# Create groups
groups = []
for i in range(len(shipping_information)):
    groups.append({"kolli-id": shipping_information[i], "name": names[i]})

# Print results (Optional for testing purposes)
print(f"Date: {date}")
print("Groups:")
for group in groups:
    print(f"\tKolli-ID: {group['kolli-id']}")
    print(f"\tName: {group['name']}")

# You can now use the 'groups' variable for further processing in Pipedream
# Here's an example of passing it as JSON (modify as needed):
# import json
# json_data = json.dumps(groups)
# # Pass json_data to the next step in Pipedream

user-1 · April 5, 2024, 12:10pm

To adapt the provided Python code for use in a Pipedream Python code step, and to ensure it can pass data to subsequent steps in a structured manner, follow the structure outlined in the Pipedream Python documentation. The adapted code below demonstrates how to access data from a previous step, process it as described, and return the processed data for use in downstream steps.

import re

def handler(pd: "pipedream"):
    # Access the extracted text from the previous step in Pipedream
    extracted_text = pd.steps["Extract_TXT_PDF"]["$return_value"]["extracted_text"]

    # Extract date
    date_match = re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", extracted_text)
    date = date_match.group(0) if date_match else None

    # Extract shipping information and name pairs
    shipping_info_and_name_pairs = []
    for match in re.finditer(r"(?<=Inlämning Kolli-ID/inl.nr: RR\d{12}SE)[\s\S]*?(?=Mottagare: )", extracted_text):
        shipping_info_and_name_pairs.append(match.group(0))

    # Extract shipping information and name from each pair
    shipping_information = []
    names = []
    for pair in shipping_info_and_name_pairs:
        shipping_info_match = re.search(r"Inlämning Kolli-ID/inl.nr: RR(\d{12})SE", pair)
        shipping_information.append(shipping_info_match.group(1) if shipping_info_match else None)
        name_match = re.search(r"Mottagare: (.*)", pair)
        names.append(name_match.group(1) if name_match else None)

    # Create groups
    groups = []
    for i in range(len(shipping_information)):
        groups.append({"kolli-id": shipping_information[i], "name": names[i]})

    # Return the groups variable for further processing in Pipedream
    return {"date": date, "groups": groups}

This code snippet demonstrates how to structure your Python code within a Pipedream workflow. It accesses data from a previous step, processes it according to the regex patterns provided, and returns the processed data in a structured format that can be used by downstream steps. Remember, the pd.steps object allows you to access data from

user-1 · April 5, 2024, 12:10pm

With the code below I get the date, but the groups (containing the kolli-id and name) don't work.

The groups are dynamic and their number is unknown for each run, but below is an example:

Inlämning Kolli-ID/inl.nr: RR094666985SE
Mottagare: Jonas Carlsson
Inlämning Kolli-ID/inl.nr: RR942941627SE
Mottagare: John Hope
… and so on. Each group has two lines, and then it just continues.

import re

def handler(pd: "pipedream"):
    # Access the extracted text from the previous step in Pipedream
    extracted_text = pd.steps["Extract_TXT_PDF"]["$return_value"]["extracted_text"]

    # Extract date
    date_match = re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", extracted_text)
    date = date_match.group(0) if date_match else None

    # Extract shipping information and name pairs
    shipping_info_and_name_pairs = []
    for match in re.finditer(r"(?<=Inlämning Kolli-ID/inl.nr: RR\d{12}SE)[\s\S]*?(?=Mottagare: )", extracted_text):
        shipping_info_and_name_pairs.append(match.group(0))

    # Extract shipping information and name from each pair
    shipping_information = []
    names = []
    for pair in shipping_info_and_name_pairs:
        shipping_info_match = re.search(r"Inlämning Kolli-ID/inl.nr: RR(\d{12})SE", pair)
        shipping_information.append(shipping_info_match.group(1) if shipping_info_match else None)
        name_match = re.search(r"Mottagare: (.*)", pair)
        names.append(name_match.group(1) if name_match else None)

    # Create groups
    groups = []
    for i in range(len(shipping_information)):
        groups.append({"kolli-id": shipping_information[i], "name": names[i]})

    # Return the groups variable for further processing in Pipedream
    return {"date": date, "groups": groups}

user-1 · April 5, 2024, 12:10pm

It looks like you're trying to extract shipping information from a text string using Python regular expressions, but you're encountering issues with grouping the data correctly. The main issue seems to be with the regular expression used to extract the shipping information and name pairs. Let's adjust the approach to correctly capture the groups of Kolli-ID and names.

Here's an updated version of your code that should correctly extract the groups of Kolli-ID and names:

import re

def handler(pd: "pipedream"):
    # Access the extracted text from the previous step in Pipedream
    extracted_text = pd.steps["Extract_TXT_PDF"]["$return_value"]["extracted_text"]

    # Extract date
    date_match = re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", extracted_text)
    date = date_match.group(0) if date_match else None

    # Extract shipping information and name pairs
    pattern = r"Inlämning Kolli-ID/inl.nr: (RR\d{9}SE)\nMottagare: (.*)"
    matches = re.findall(pattern, extracted_text)

    # Create groups
    groups = [{"kolli-id": match[0], "name": match[1]} for match in matches]

    # Return the groups variable for further processing in Pipedream
    return {"date": date, "groups": groups}

This code uses re.findall with a pattern that matches the "Inlämning Kolli-ID/inl.nr" line followed by the "Mottagare" line. The pattern captures the Kolli-ID and name as separate groups, which are then used to create the list of dictionaries you're looking for. This approach should work as long as the text structure remains consistent with the example you've provided.

user-1 · April 5, 2024, 12:10pm

I'm running the code below.

I want to get the text data from the previous step, "steps.code.$return_value.groups".

import requests  # Install with: pip install requests

# WooCommerce API details (replace with your credentials)
base_url = "https://your-store.com/wp-json/wc/v3/"
consumer_key = "ck_bfa98862a474598945e1249fdab93bfc0d60"
consumer_secret = "cs_2cd2c9da79546664237abd06f61a9659"

# Dynamic list of kolli-ids
kolli_ids = [
    {"kolli-id": "RR094592483SE", "name": "May Daniels"},
    {"kolli-id": "RR094094327SE", "name": "Dario Hollers"},
    # … other kolli-id entries
]

# Authentication (replace with your own method)
auth = requests.auth.HTTPBasicAuth(consumer_key, consumer_secret)

def find_order_by_kolli_id(kolli_id):
    """Searches WooCommerce orders for the given kolli-id."""
    url = f"{base_url}orders?search={kolli_id}"
    response = requests.get(url, auth=auth)
    response.raise_for_status()  # Raise an exception for non-200 status codes

    orders = response.json()
    if orders:
        # Assuming the kolli-id is unique and only appears in one order
        return orders[0]
    else:
        return None

def send_customer_message(order_id, message):
    """Sends a customer message to the order."""
    url = f"{base_url}orders/{order_id}/customer-notes"
    data = {"note": message}
    response = requests.post(url, auth=auth, json=data)
    response.raise_for_status()

for kolli_id_data in kolli_ids:
    kolli_id = kolli_id_data["kolli-id"]
    order = find_order_by_kolli_id(kolli_id)

    if order:
        message = f"Din order {order['number']} blev tidigare inlämnad till PostNord"
        send_customer_message(order["id"], message)
        print(f"Message sent for order: {order['number']}")
    else:
        print(f"No order found for kolli-id: {kolli_id}")