What is causing repeated RSS Item posts on Mastodon/Twitter?

This topic was automatically generated from Slack. You can find the original thread here.

Ugh - so my “On new RSS Item, post to Mastodon/Twitter” has broken twice now. I do a new blog entry, and all my RSS Items fire, not just the new one. I need to disable it until it’s fixed - but any idea what’s going wrong?

Looking at the workflow, I can see 10 new events today - my entire feed, even though only one item is new.

Can you share the URL of your feed?

https://www.raymondcamden.com/feed.xml

Does it only process the events again intermittently, or is it emitting the duplicate events on each run?

It's definitely only firing when I do a new blog post, so it gets that part right, but then it fires once for each item in the feed.

Can you DM me your workflow URL? I just need to get the workflow ID

I have a mock RSS feed producing a new event every 15 minutes, and I haven't seen duplicate emits on new runs when new items show up in the feed. But there may be some non-deterministic behavior, and I'd like to look into it.

is it possible that any of the information in a given entry would change over time?

<id>https://www.raymondcamden.com/2022/11/25/a-bare-bones-eleventy-template-for-glitch</id>
<title>A Bare-Bones Eleventy Template for Glitch</title>
<updated>2022-11-25T18:00:00+00:00</updated>
<link href="https://www.raymondcamden.com/2022/11/25/a-bare-bones-eleventy-template-for-glitch" rel="alternate" type="text/html" title="A Bare-Bones Eleventy Template for Glitch"/>
<content type="html"> <p>A few weeks ago I blogged about a simple <a href="https://www.raymondcamden.com/2022/10/28/an-alpinejs-template-for-glitch">Alpine.js template for Glitch</a> projects. I'm still new to <a href="https://glitch.com">Glitch</a> and wanted to give it a whirl with an Eleventy demo I wanted to share. Glitch has an Eleventy template, but it's a bit verbose. It sets up a basic blog with sample posts and such, and that's great to learn, but if you already know Eleventy, you may prefer to start off a bit simpler.</p><p>With that in mind, I created this repository: <a href="https://github.com/cfjedimaster/glitch-eleventy">https://github.com/cfjedimaster/glitch-eleventy</a> It defines an <code>.eleventy.js</code> file that specifies an input and output directory. It sets up a very basic HTML layout and an empty index page that uses it. I also used Liquid for my demo whereas the Glitch-provided one uses Nunjucks.</p><p>I was tempted to add a very basic style sheet (by basic I mean empty) and ensure Eleventy copied it to the output, but wasn't sure how often I'd use that in demos. As always, I'm open to suggestions (and PRs!) on this, but my goal is to keep this as slim as possible. If folks create new projects based on my repo and have to spend time removing stuff, then that's a failure imo. Anyway, let me know if this is helpful!</p><p>Photo by <a href="https://unsplash.com/@lazycreekimages?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Michael Dziedzic</a> on <a href="https://unsplash.com/s/photos/glitch?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></p> </content>
<category term="javascript"/>
<category term="eleventy"/>
<category term="development"/>
<author>
<name>Raymond Camden</name>
<email>raymondcamden@gmail.com</email>
</author>

Currently we hash the entire entry / item and use that hash to dedupe a given entry (see rss.app.ts on master in the PipedreamHQ/pipedream repo on GitHub). If the item has both a pubdate and guid, or date_published and id, we use that compound key as the key of the item instead.
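To make the behavior described above concrete, here is a minimal sketch of that kind of key selection. It is not the code in rss.app.ts: the `FeedItem` shape, the `itemKey` and `fnv1a` names, the key format, and the choice of hash are all assumptions for illustration.

```typescript
// Hypothetical item shape; the field names follow the message above,
// not necessarily what the real parser produces.
interface FeedItem {
  guid?: string;
  pubdate?: string;
  id?: string;
  date_published?: string;
  [field: string]: unknown;
}

// Small FNV-1a string hash, a stand-in for whatever hash the real source uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Prefer a compound (date + identifier) key; otherwise hash the whole item.
function itemKey(item: FeedItem): string {
  if (item.guid && item.pubdate) return `${item.pubdate}-${item.guid}`;
  if (item.id && item.date_published) return `${item.date_published}-${item.id}`;
  // Sort the field names so that field order does not change the hash.
  const sorted = Object.keys(item).sort().map((name) => [name, item[name]]);
  return fnv1a(JSON.stringify(sorted));
}

// Same post, two different dates: the keys differ, so the post looks new.
const first = itemKey({ guid: "post-1", pubdate: "2022-12-06T16:27:25.000Z" });
const second = itemKey({ guid: "post-1", pubdate: "2022-12-08T21:53:22.000Z" });
console.log(first === second); // false
```

With either strategy, any change to a date field (or to any field, in the full-hash case) produces a new key, so an otherwise unchanged post is treated as new.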

I can't see how, unless I edit the feed generator, which I haven't in a while.

is it possible to look at the entry from an event today, and from my last post yesterday?

yeah I’ll compare the content of the emitted events

yeah if you visit https://pipedream.com and select the event for the post “Automatically Posting to Mastodon and Twitter on New RSS Items” emitted today, you’ll see a pubDate within the meta field of 2022-12-08T21:53:22.000Z. Select the event for the same post that was emitted on Dec 6th, meta.pubDate is 2022-12-06T16:27:25.000Z
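The comparison above can also be done mechanically. The sketch below lists which top-level fields differ between two emitted events; the `changedFields` helper and the flat event shape are assumptions, and only the two `pubDate` values come from this thread.

```typescript
type EventFields = Record<string, unknown>;

// Return the names of the top-level fields whose values differ between a and b.
function changedFields(a: EventFields, b: EventFields): string[] {
  return Object.keys({ ...a, ...b })
    .filter((name) => JSON.stringify(a[name]) !== JSON.stringify(b[name]))
    .sort();
}

// Hypothetical meta objects for the same post, emitted on two different days.
const metaDec6 = {
  title: "Automatically Posting to Mastodon and Twitter on New RSS Items",
  pubDate: "2022-12-06T16:27:25.000Z",
};
const metaDec8 = {
  title: "Automatically Posting to Mastodon and Twitter on New RSS Items",
  pubDate: "2022-12-08T21:53:22.000Z",
};

console.log(changedFields(metaDec6, metaDec8)); // [ 'pubDate' ]
```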

oh wow

I think we can probably simplify the deduper logic to use just the post id, but I’d like to investigate that more before we modify
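As a rough illustration of the id-only idea (not a proposed patch), the sketch below keeps a set of ids already seen and emits only entries whose id is new. The `Entry` shape and the `newEntries` helper are hypothetical, and the in-memory set stands in for whatever persistent state the real source would use.

```typescript
// Hypothetical entry shape, loosely following the Atom <entry> shown earlier.
interface Entry {
  id: string;
  title?: string;
  updated?: string;
}

// Return only entries whose id has not been seen, and remember their ids.
function newEntries(feed: Entry[], seen: Set<string>): Entry[] {
  const fresh: Entry[] = [];
  for (const entry of feed) {
    if (seen.has(entry.id)) continue;
    seen.add(entry.id);
    fresh.push(entry);
  }
  return fresh;
}

const seen = new Set<string>();

// First poll: both entries are new.
newEntries(
  [
    { id: "https://example.com/a", updated: "2022-12-06T18:00:00+00:00" },
    { id: "https://example.com/b", updated: "2022-12-06T18:00:00+00:00" },
  ],
  seen
);

// Second poll: the dates on a and b changed, but only c is emitted.
const emitted = newEntries(
  [
    { id: "https://example.com/a", updated: "2022-12-08T18:00:00+00:00" },
    { id: "https://example.com/b", updated: "2022-12-08T18:00:00+00:00" },
    { id: "https://example.com/c", updated: "2022-12-08T18:00:00+00:00" },
  ],
  seen
);
console.log(emitted.map((entry) => entry.id)); // [ 'https://example.com/c' ]
```

The trade-off is that a genuine edit to an existing post would no longer be emitted, which may be one reason to investigate before changing the deduper.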

So I don't have pubDate in my feed, I have entry. But I bet your RSS parser simplifies it to pubDate.

well shoot

Looking at updated, I see fixed time values: <updated>2022-12-08T18:00:00+00:00</updated>

i.e. the days change, not the times

looking at Events in the source UI, looks like pubDate is being set to now?