Is it Possible to Configure Pipedream to Run docTR as Described in the Provided Link?

This topic was automatically generated from Slack. You can find the original thread here.

Hello everyone, how are you?
I would like to know whether it is possible to apply this configuration in Pipedream in order to run docTR.
Thanks.

Hi Marcelo, thanks for the link.

Both of these settings are configurable through environment variables, so you can set them in your Pipedream dashboard to work around these compatibility issues.

  1. Disable the usage of the multiprocessing package by setting the DOCTR_MULTIPROCESSING_DISABLE environment variable to TRUE. This step is necessary because the package uses the /dev/shm directory for shared memory.
    So setting DOCTR_MULTIPROCESSING_DISABLE to TRUE in your workspace environment variables in Pipedream should help with this configuration.
  2. Change the caching directory used by docTR for models. By default, it is set to ~/.cache/doctr, which is outside the /tmp directory on AWS Lambda. You can modify this by setting the DOCTR_CACHE_DIR environment variable.
    So you can set DOCTR_CACHE_DIR to /tmp within your workspace’s environment variables, and that should help with that configuration.
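For anyone who wants to check the two settings from inside a Python code step, here is a minimal sketch. It assumes the variables are read from the process environment when docTR is imported, so they are set before the import. The helper name `configure_doctr_env` is made up for illustration; only the two variable names and the /tmp location come from the steps above. Values already set in the Pipedream dashboard are left untouched.

```python
import os


def configure_doctr_env(cache_dir: str = "/tmp") -> dict:
    """Hypothetical helper: apply the two settings described above.

    setdefault() keeps any value that is already present, for example one
    configured as a workspace environment variable in Pipedream.
    """
    # Turn off docTR's use of multiprocessing (which relies on /dev/shm).
    os.environ.setdefault("DOCTR_MULTIPROCESSING_DISABLE", "TRUE")
    # Point the model cache at a writable location instead of ~/.cache/doctr.
    os.environ.setdefault("DOCTR_CACHE_DIR", cache_dir)
    names = ("DOCTR_MULTIPROCESSING_DISABLE", "DOCTR_CACHE_DIR")
    return {name: os.environ[name] for name in names}


# Call this before importing doctr so the settings are in place at import time.
settings = configure_doctr_env()
print(settings)
```

If the printed values differ from what you expect, the dashboard variables are probably taking precedence, which is the intended behavior of this sketch.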

Thanks for the guidance, Pierce, but I’m now facing this import error:
libGL.so.1: cannot open shared object file: No such file or directory

You might want to report that to the maintainers of the project. There might be an undocumented environment variable you’ll need to set in order to exclude that dependency.

ok then, thanks again Pierce !

Sure thing :slightly_smiling_face:

Hi Pierce, how are you? Would it be possible for you to help me clarify things for the mindee/doctr repository team in the issue I opened?

Or maybe you can help me answer them.

Thanks - the opencv dependency wasn’t listed in that initial requirements page you sent me. I didn’t realize that was required in the environment.
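If it helps when reporting this, here is a small diagnostic sketch for confirming whether the shared library named in the error can be loaded in a given environment. It only uses the standard-library `ctypes` module; the function name `shared_library_available` is hypothetical, and the check says nothing about which package needs the library.

```python
import ctypes


def shared_library_available(name: str = "libGL.so.1") -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Same failure mode as the import error above: the file cannot be opened.
        return False
    return True


print("libGL.so.1 available:", shared_library_available())
```

A result of False in a workflow step would match the import error reported earlier in this thread.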

Unfortunately, at this time we don’t support custom Docker containers that would let you pick and choose your own dependencies for workflows. However, this is under active discussion internally. We know there are cases where you need to bring in binaries, but we can’t install custom binaries on request because that would add unwanted binaries to all Pipedream customers’ workflows.

Please leave a comment on this GitHub issue with your use case. It helps us aggregate all of the potential use cases and make sure yours is included as we design this feature:

ok, thanks Pierce