
Joining mastodon and slowly posting a wordpress RSS/atom feed

Mastodon is a social network that appears similar to twitter: you get a feed where you see “toots” instead of “tweets”, and you can send your own toots that will be seen by others following you or searching for a hashtag found in your toot.

Mastodon differs from twitter in that, like diaspora, it is not under corporate control. Also like diaspora, Mastodon is a non-profit, user-owned, distributed social network. The mastodon software is free software (GNU AGPL-3.0 or later). The mastodon server side is written in ruby and the mastodon frontend is written in react/redux. And also, like diaspora, it is part of the fediverse.

The first thing to do is to find a server to join. Again like diaspora, mastodon consists of interconnected servers. The servers are owned by different people, and the rules for registration and for what can be posted can be quite different from server to server.

Note: you can share posts and listen to postings across servers. You’re not limited to the server you’re joining.

I looked at a couple of servers and found them too restrictive, and ended up joining mastodon.social.

I can be reached on mastodon with @steinarb@mastodon.social.

The next thing to do was to find a way to slowly post my wordpress feed in chronological order.

I quickly found a utility called feediverse. It’s written in python, and according to its author, written over the course of a weekend. According to the commits on feediverse’s github page it was written 2 years ago and has received no commits since then.

But it still does the job.

The first thing to do was to install it on my debian server. As root, first install pip3, then use pip3 to install feediverse:

apt install python3-pip
pip3 install feediverse

But that only took me part of the way, because pointing feediverse at my wordpress feed would have fed the entire feed to mastodon in reverse chronological order. And like for diaspora, I wanted to post the feed one post a day, oldest first.

Note: The diaspora post contains changes to the wordpress feed (make the feed contain only the summaries of all posts on the blog) that should be applied to get the expected results.

I created the github issue “Make it possible to slowly toot a comple feed in cronological order” for feediverse, with the intent of providing a PR for it.

However, I decided on a simpler solution: create a separate script that reads the feed and posts the entries, one post at a time, to a local file.
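The idea can be sketched as follows: keep a counter of how far into the backlog we have come, and write out a feed containing only the next entry. This is a hypothetical, stdlib-only illustration, not the actual feedreverser code (which, judging by its pip dependencies, uses feedparser and feedgen):

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from pathlib import Path

def next_entry_as_rss(feed_xml, state_file):
    """Return an RSS document containing only the next unposted entry,
    oldest first, or None when the backlog is exhausted."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    items = channel.findall("item")
    # wordpress feeds are newest-first; sort oldest-first instead
    items.sort(key=lambda i: parsedate_to_datetime(i.findtext("pubDate")))
    index = int(state_file.read_text()) if state_file.exists() else 0
    if index >= len(items):
        return None
    for item in items:          # keep only the selected entry in the channel
        channel.remove(item)
    channel.append(items[index])
    state_file.write_text(str(index + 1))  # remember how far we have come
    return ET.tostring(root, encoding="unicode")
```

Run daily from cron with the output written to /tmp/reversed.rss, this drip-feeds the backlog one entry per day.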

To use this script on a debian system:

  1. As root, install the dependencies of the script, using pip3:

    pip3 install pyyaml
    pip3 install feedparser
    pip3 install python-dateutil
    pip3 install feedgen
    pip3 install beautifulsoup4
    
  2. As your own user:
    1. Clone the github repo of the script:

      mkdir -p $HOME/git
      cd $HOME/git
      git clone https://github.com/steinarb/feedreverser.git
      
    2. Run feedreverser once manually with the following command:

      /usr/bin/python3 $HOME/git/feedreverser/feedreverser.py
      
    3. When prompted, give the feed URL (you can find it by right-clicking on an RSS symbol in your blog and copying the URL), and give /tmp/reversed.rss as the file to store the reversed feed in
    4. Add a crontab entry that runs the script once every 24h:

      10 6 * * * /usr/bin/python3 $HOME/git/feedreverser/feedreverser.py
      

At this point, /tmp/reversed.rss contains the oldest post in the feed, and within 24h it will be replaced by the second oldest post.

So the next thing to do is to set up feediverse to post the contents of the feed in this local file to mastodon.

First create an app in mastodon:

  1. Click on “Preferences”
  2. In the page that opens, click on “Development”
  3. In “Your applications”, click on “New application”
  4. In “New application”:
    1. In application name, give:

      feediverse
      
    2. Make sure read and write are checked (they are, by default)
    3. Click on “Submit”
  5. In “Your applications”, click on “feediverse”
  6. Make a note of the values for access token and client secret. You will need them when doing the initial setup run of feediverse

Then, as your own user, run feediverse from the command line:

feediverse

When prompted for

  1. “What is your Mastodon Instance URL”, give the top URL for your mastodon instance (for me, it is https://mastodon.social )
  2. “Do you have your app credentials already? [y/n]”, answer “y”
  3. “What is your app’s client id”, answer:

    feediverse
    
  4. “What is your client secret”, give the client secret you made a note of, earlier
  5. “access_token”, give the access token you made a note of, earlier
  6. “RSS/Atom feed URL to watch”, give a file URL for the file you told feedreverser to write to, i.e.

    file:///tmp/reversed.rss
    
  7. Open the $HOME/.feediverse file (a YAML file), in a text editor, and change the line

    - template: '{title} {url}'
    

    to

    - template: '{title} {url} {hashtags} {summary}'
    
  8. If your first post contains hashtags and a summary you would like to see as much of as possible in the toot, then delete the “updated:” line in the .feediverse file and re-run the feediverse command (mine didn’t)
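After these steps, the $HOME/.feediverse file should contain roughly the following. The field names here are taken from the prompts above; the exact layout may differ between feediverse versions, and an “updated:” line is added after the first run:

```yaml
url: https://mastodon.social
client_id: feediverse
client_secret: <the client secret from the application page>
access_token: <the access token from the application page>
feeds:
- url: file:///tmp/reversed.rss
  template: '{title} {url} {hashtags} {summary}'
```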

Then add a crontab entry that runs feediverse once an hour, at 15 minutes past the hour:

15 * * * * /usr/local/bin/feediverse 2>/dev/null

The error redirect is there because of a yaml warning from feediverse, reported in a feediverse issue (I have provided a PR for this warning, referenced from the issue).

At this point feediverse is set up to slowly post your wordpress feed one feed entry a day.

If you let the setup keep running past emptying the feed backlog, it will still work, but it will post at most a single new feed entry per day (since feedreverser only advances the local feed once every 24h).

If you want feediverse to post new entries to your feed as they appear, just:

  1. In crontab, remove the entry for feedreverser
  2. In the $HOME/.feediverse file, replace:

    url: file:///tmp/reversed.rss
    

    with the feed URL you gave to feedreverser, e.g.

    url: https://steinar.bang.priv.no/feed/atom/
    

But note that there are some feediverse issues that might bite you, which feedreverser protects you from by fixing them:

  1. “Correctly handle hashtags with spaces”: wordpress categories and tags with spaces in them are currently posted as separate hashtags (e.g. “debian 8” becomes “#debian #8”). I have provided a PR for this, which replaces the spaces in categories and tags with underscores
  2. “toots with more than 500 characters fails”: there is a typo in the commit that attempted to correct this, and I have provided a PR that fixes the typo
  3. “Would it be able to change html to readable plain text, and post images?”: titles with character entities in them show up sort of unreadable (e.g. “Installing apache karaf on debian stretch”), and descriptions with both HTML tags and character entities can get very unreadable. The feedreverser script uses BeautifulSoup to fix this. At the time of writing there is no PR to feediverse that fixes it
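To illustrate the third point: the cure is to run titles and summaries through an HTML-to-text conversion before posting. feedreverser uses BeautifulSoup’s get_text() for this; a rough stdlib-only approximation of the same tag-and-entity cleanup looks like:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, with character
    references (e.g. &#8220;) decoded by the parser itself."""
    def __init__(self):
        super().__init__()  # convert_charrefs=True is the default
        self.parts = []
    def handle_data(self, data):
        self.parts.append(data)

def html_to_text(markup):
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)
```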

If you want a version of feediverse that fixes the first two, you can get my fork of feediverse and use that instead:

  1. Clone my fork of feediverse and use the branch where I’ve combined my PRs:

    mkdir -p $HOME/git
    cd $HOME/git
    git clone https://github.com/steinarb/feediverse.git
    cd feediverse
    git checkout steinarsbranch
    
  2. Change the crontab line to run the cloned script instead of the script installed with pip3 (note that the need to redirect the error output is gone because the yaml issue is fixed):

    15 * * * * /usr/bin/python3 $HOME/git/feediverse/feediverse.py
    

Joining diaspora and slowly posting a wordpress RSS/atom feed

Diaspora is a social network that appears similar to Facebook in its behaviour: you get a web UI with a feed, and what ends up in that feed comes from your friends, your groups, and the hashtags you filter for.

Diaspora differs from Facebook in that it is not under corporate control. Diaspora, according to its wikipedia entry, is a non-profit, user-owned, distributed social network. Diaspora’s software is free software (GNU AGPL-3.0) and written in ruby on rails.

To start using Diaspora, one first has to decide which “pod” (i.e. Diaspora instance) to join. In principle you can see everything posted by any user in the Diaspora fediverse, but there may be limitations in what the pod owner allows. I didn’t do much searching: I found a pod with a .no address and registered there. The first thing I was asked to do was to create a “Here I am” posting with a lot of hashtags referring to my interests.

I did that, and received several welcome replies. So that was kind of friendly.

If you want to see the postings described in this article, add steinarb@diasporapod.no to your contacts.

I then started the process of posting the RSS feed of this blog to diaspora.

Ideally I would have liked to post the feed entries in chronological order, with the dates they were originally posted.

However, the REST API of diaspora doesn’t allow setting the date of a post. The date you post is the date you get. So posting with the original date was out.

Next best would be to post the wordpress feed entries in blog post chronological order at a slowed rate, perhaps 1 post a day, so I wouldn’t be seen to flood the diaspora feeds. So this is what I set out to do.

The first thing I had to do was to adjust the RSS feeds on wordpress, to get everything to the start of the feed. WordPress was set up to only show the 10 most recent blog posts in the feed. The feed was also set up to contain the entire bodies of the blog posts, and that was a little too much for this use case. So I adjusted the feeds to list only the blog post summaries.

I googled for a tool to do the posting and found this list on the diaspora wiki. The python script pod_feeder_v2 was at the start of the list, so that’s the one I tried first, and the one I stayed with.

Installing pod_feeder_v2 was done by running the following commands, logged in as root, on a debian system:

  1. First installed python3-pip:

    apt install python3-pip
    
  2. Then installed pod_feeder_v2:

    pip3 install pod-feeder-v2
    

The way pod_feeder_v2 works is that it first reads all posts from the RSS feed it listens to, and then stores all entries it hasn’t seen before into a sqlite table (identified by the feed entry GUID). It then traverses the sqlite table in chronological order and posts the unposted entries, marking entries in the table as they are posted, so they won’t be posted more than once.
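That bookkeeping can be sketched like this. This is a hypothetical illustration of the mechanism with a simplified table layout, not pod_feeder_v2’s actual code:

```python
import sqlite3

SCHEMA = """CREATE TABLE IF NOT EXISTS feeds (
    feed_id TEXT, guid TEXT UNIQUE, title TEXT,
    posted INTEGER DEFAULT 0, timestamp TEXT)"""

def store_new_entries(db, feed_id, entries):
    # entries the table has already seen (same GUID) are silently skipped
    for guid, title in entries:
        db.execute(
            "INSERT OR IGNORE INTO feeds (feed_id, guid, title, posted, timestamp)"
            " VALUES (?, ?, ?, 0, CURRENT_TIMESTAMP)",
            (feed_id, guid, title))
    db.commit()

def post_unposted(db, feed_id, limit, post_fn):
    # walk the unposted entries oldest-first, marking each one as posted
    rows = db.execute(
        "SELECT guid, title FROM feeds"
        " WHERE feed_id = ? AND posted = 0"
        " ORDER BY timestamp, rowid LIMIT ?", (feed_id, limit)).fetchall()
    for guid, title in rows:
        post_fn(title)  # stand-in for actually posting the entry to the pod
        db.execute("UPDATE feeds SET posted = 1 WHERE guid = ?", (guid,))
    db.commit()
```

Note how entries stored from a newest-first feed all get (near-)identical timestamps in insertion order, which is the root of the ordering problem described next.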

The time stamp in the sqlite table entries is the time the entries were written to the table. Since wordpress feeds list the newest entries first, pod_feeder_v2 on the initial read puts the entries into the table in the wrong order, all with (more or less) the same time stamp.

So what I tried to do, was to first fill up the database with the existing posts, and then semi-manually correct the time stamp of each post in the database (34 posts, stretching back to 2012) and then start posting from the table 1 post per day.

Luckily pod_feeder_v2 has a command line option --fetch-only that allows reading the feed and updating the sqlite table without posting anything. So I ran this command to populate the table in feed.db (logged in with my regular user account, not as root):

pod-feeder --feed-id steinarblog --category-tags --ignore-tag uncategorized --feed-url https://steinar.bang.priv.no/feed/atom/ --pod-url https://diasporapod.no --fetch-only

This adds both the wordpress categories and the wordcloud tags as hashtags in the post, while --ignore-tag removes the tag uncategorized from the hashtags.
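Conceptually, the tag handling amounts to something like this hypothetical sketch (the space-to-underscore replacement is the behaviour of my feediverse PR mentioned earlier, not necessarily what pod_feeder_v2 does):

```python
def to_hashtags(categories, ignore=frozenset()):
    # each category becomes a hashtag, ignored tags are dropped, and
    # spaces are replaced so that e.g. "debian 8" stays a single tag
    return ["#" + c.replace(" ", "_") for c in categories
            if c.lower() not in ignore]
```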

To manipulate the database table I installed the sqlite command line tool, with the following command, logged in as root:

apt install sqlite3

I started the sqlite3 command line tool, logged in as my own user account (“feed.db” is the file holding the sqlite database; it is created in the home directory of the user running pod-feeder):

sqlite3 feed.db

I listed the available GUIDs:


SELECT guid FROM feeds WHERE feed_id == 'steinarblog';

I took the GUIDs into an emacs SQL buffer and manipulated them into update lines, setting the time stamp to the time of the update. Then I reversed the lines in the buffer, so that the oldest post came first:

update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=1';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=7';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=14';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=23';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=26';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=33';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=44';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=46';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=53';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=63';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=60';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=66';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=83';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=165';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=171';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=191';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=196';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=209';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=214';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=223';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=224';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=238';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=250';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=255';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=261';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=269';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=265';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=286';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=292';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=306';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=332';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=342';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=353';
update feeds set timestamp = CURRENT_TIMESTAMP where guid = 'https://steinar.bang.priv.no/?p=310';

I manually pasted each line into sqlite3, one line at a time, waiting a couple of seconds between each paste. It was only 34 lines to paste so this seemed the simplest way. I watched television as I was doing this.
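An alternative to pasting by hand: the same bumping could be scripted, with a sleep between updates so the CURRENT_TIMESTAMP values (one-second resolution) come out strictly increasing. A hypothetical sketch, assuming the feeds table layout used by pod-feeder:

```python
import sqlite3
import time

def reorder(db_path, guids_oldest_first, delay=2.0):
    # bump each GUID's timestamp in turn; the delay keeps the resulting
    # CURRENT_TIMESTAMP values (one-second resolution) strictly increasing
    db = sqlite3.connect(db_path)
    for guid in guids_oldest_first:
        db.execute("UPDATE feeds SET timestamp = CURRENT_TIMESTAMP"
                   " WHERE guid = ?", (guid,))
        db.commit()
        time.sleep(delay)
    db.close()
```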

The next bit to do was to post the articles in the database, one post per day.

I set up a cronjob to run once a day, posting from the database, and using --limit 1 to ensure only one post was made each day:

0 6 * * * /usr/local/bin/pod-feeder --feed-id steinarblog --summary --category-tags --ignore-tag uncategorized --feed-url https://steinar.bang.priv.no/feed/atom/ --limit 1 --pod-url https://diasporapod.no --username steinarb --password xxxxx --quiet

pod-feeder will only attempt to post entries for 72 hours, so to avoid the unposted entries being forgotten, I added another cronjob bumping the timestamp of unposted articles by 24 hours:

5 6 * * * /usr/bin/sqlite3 feed.db 'update feeds set timestamp = DATETIME(timestamp, "+1 day") where posted = 0;'

With this setup, one article per day will be posted until the backlog is empty. At that point I will change the pod-feeder cronjob to run more often than once a day, and remove the job bumping the timestamps of unposted entries.