Building an RSS reader in JavaScript

Jenn - Jul 16 '20 - Dev Community

Blogging never went away and neither did RSS feeds. RSS (Really Simple Syndication) is a web feed format that lets you check sites for updates. It used to be quite popular, with several apps devoted to reading these feeds, and many browsers used to have RSS readers built in. Sadly, RSS has fallen out of popularity, probably because social media and other feeds took over its role.

But last night, on a whim, I decided to build a personal RSS reader with vanilla JS. Okay, it wasn't a whim: Twitter was on fire, and what better way to distract myself from my usual distraction than creating a new web application?

The tools

Setting up the project

Whenever I start a new project, I look for examples of the idea or similar ideas to build from. I did a search for "rss reader javascript" and I came across several older projects and tutorials. Many of them were written in older syntax or used frameworks. I found one tutorial that used the (then new) Fetch API and decided to build on that.

Finding RSS feeds

Finding feeds is harder than it used to be back in 2006. I searched online for example feeds to pull from and decided on dev.to's feed, CodePen's feed, and the Mozilla Hacks blog. I tested that I could reach all the feeds in the browser and was served the appropriate XML.

An example RSS XML document from my personal dev.to feed.

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Jenn</title>
    <author>Jenn</author>
    <description>Jenn is a self taught web developer who specializes in usability and accessibility.  She is easily spotted at conferences by her bright lipstick and various code dresses and t-shirts.</description>
    <link>https://dev.to/geekgalgroks</link>
    <language>en</language>
    <item>
    ...
    </item>
  </channel>
</rss>
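
To show how a document like this can be read from JavaScript, here is a minimal sketch. The function name summarizeChannel is made up for illustration and is not part of my reader. It accepts any object with a querySelector method, such as the document the browser's DOMParser returns.

```javascript
// Hypothetical helper (not from the reader itself): pulls the channel-level
// fields out of a parsed RSS document.
function summarizeChannel(doc) {
  // Read the trimmed text of the first matching element, or "" if it is absent.
  const text = (selector) => {
    const node = doc.querySelector(selector);
    return node ? node.textContent.trim() : "";
  };
  return {
    title: text("channel > title"),
    link: text("channel > link"),
    description: text("channel > description"),
  };
}

// In a browser, DOMParser turns the XML string into a queryable document.
if (typeof DOMParser !== "undefined") {
  const xml = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><title>Jenn</title>
<link>https://dev.to/geekgalgroks</link></channel></rss>`;
  const doc = new DOMParser().parseFromString(xml, "text/xml");
  console.log(summarizeChannel(doc));
}
```

The "channel > title" selector matters: a bare "title" selector would also match the title of every item in the feed, and which one comes back depends on document order.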

Database

I wanted a way to easily update my list of feeds without having to update my code. I had used Google's Firestore for several other little projects, so I spun up a new collection called rssfeed. I decided the only thing I needed was the URL, and added four RSS feed URLs to the collection.

Diving straight in

As I had written other little JavaScript web apps that used Firestore, I started by copying what I did in those projects.

I created a global variable to hold my feeds and queried the database to push the URL values into it.

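    // Global array that will hold the feed URLs once the query resolves.
    const feeds = [];
    const database = firebase.firestore().collection('rssfeed');
    // get() is asynchronous: feeds stays empty until this callback runs.
    database.get().then((querySnapshot) => {
        querySnapshot.forEach((doc) => {
            feeds.push({
                id: doc.id,
                url: doc.data().url
            });
        });
    });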

First problem

I was getting 404 errors in my console. I realized I forgot to set the Firestore database rules to allow reading of the collection.

I copied the rules of a previous collection and after waiting a bit, confirmed they worked.

    match /rssfeed/{feedId} {
        allow read;
        allow write: if request.auth.uid == 'REDACTED';
    }

I was now able to console log the value of the array and confirm everything was working.

Doing too much at once

Spurred on by this success, I continued. I built a function that used Fetch to get the title of a feed, then called it from a forEach loop over my array.

I got a bunch of odd errors.

CORS and Promises

The first error message that made sense in the console was about CORS.

CORS

CORS stands for Cross-Origin Resource Sharing. By default, browsers stop scripts on one site from reading assets (JavaScript, images, APIs, etc.) served by another site, and CORS is how a site opts in to sharing them. Some sites keep all their assets locked down, others explicitly let other sites use some or all of them.
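
As a rough illustration of the check the browser performs, the sketch below models the simplest case: a plain GET with no credentials. The function name canReadResponse is hypothetical and the logic is a deliberate simplification. Real browsers also handle preflight requests, credentials, and other headers.

```javascript
// Simplified model (an assumption for illustration, not the browser's full
// algorithm): a script running at pageOrigin may read a cross-origin response
// only if the server's Access-Control-Allow-Origin header is "*" or exactly
// matches the page's origin.
function canReadResponse(pageOrigin, resourceUrl, allowOriginHeader) {
  const resourceOrigin = new URL(resourceUrl).origin;
  if (resourceOrigin === pageOrigin) return true; // same origin: CORS is not involved
  if (allowOriginHeader === "*") return true; // server shares with everyone
  return allowOriginHeader === pageOrigin; // server shares with this page only
}

// Placeholder origins; null stands for a response with no CORS header at all.
console.log(canReadResponse("https://my-site.example", "https://feeds.example/rss", "*")); // true
console.log(canReadResponse("https://my-site.example", "https://feeds.example/rss", null)); // false
```

A feed that never sends the header falls into the second case, which is why it loads fine when opened directly in a browser tab but fails when fetched from a script on another site.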

Some of the feeds were being protected by CORS.

At the time I thought it was all the feeds. I looked up how to add CORS modes to my Fetch call.

// No CORS: only simple request headers are sent, and the response is "opaque" (its body cannot be read).
fetch(feed, {mode: 'no-cors'});

This didn't help. I started looking at proxies and other solutions.
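
One likely reason it didn't help: a no-cors request can succeed, but the response that comes back is opaque, so the page cannot read the feed's XML. The sketch below classifies a response using the standard type, status, and ok properties. The describeResponse name is hypothetical, and the plain object passed to it stands in for a real Response.

```javascript
// Hypothetical helper: explains what a page can do with a fetch response.
function describeResponse(response) {
  if (response.type === "opaque") {
    return "opaque: the request was sent, but the body is hidden from the page";
  }
  if (!response.ok) {
    return `readable, but the server answered with status ${response.status}`;
  }
  return "readable: the body can be read with response.text()";
}

// An opaque response reports type "opaque", status 0, and ok false.
console.log(describeResponse({ type: "opaque", status: 0, ok: false }));
```

In a browser, the same function could be chained onto the real call, for example fetch(feed, {mode: 'no-cors'}).then(describeResponse).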

Searching again

I was frustrated. I searched again for projects, looking for something newer that might give me insight into how to combat CORS.

I stumbled across the CSS-Tricks article How to Fetch and Parse RSS Feeds in JavaScript. It had a working example and was written in 2020!

I commented out all of my code and pasted their example in. Everything worked. I changed the hardcoded URL from CodePen to my dev.to feed, and everything still worked. I wrapped the fetch call in a function and tested again; it worked. I was feeling great. I added my database call back in and, using forEach on my array, called the function.

It didn't work because my array wasn't populated yet: the database call had only handed back a promise, and my loop ran before that promise resolved.

Promises

Promises are placeholders. Asynchronous functions return promises instead of blocking everything on the page while they work. The function promises to get you a result.

My array was still waiting on a promise. Fetch couldn't pull down content from a promised URL; it needed the real thing.
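
The timing problem can be reproduced without a database. In this minimal sketch, loadFeedUrls is a hypothetical stand-in for the Firestore query and the example.com URLs are placeholders.

```javascript
// Hypothetical stand-in for the Firestore query: resolves with feed URLs
// asynchronously, the way database.get() does.
function loadFeedUrls() {
  return Promise.resolve([
    "https://example.com/feed-a.xml",
    "https://example.com/feed-b.xml",
  ]);
}

const feeds = [];
const pending = loadFeedUrls().then((urls) => {
  urls.forEach((url) => feeds.push(url));
  return feeds;
});

// This line runs before the callback above, so the array is still empty.
const lengthBeforeResolve = feeds.length;
console.log(lengthBeforeResolve); // 0

// Inside then(), the query has finished and the URLs are available.
pending.then((loaded) => console.log(loaded.length)); // 2
```

Looping over feeds at the point where lengthBeforeResolve is read does nothing, because there is nothing in the array yet.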

This is where then() comes in handy. It waits until the asynchronous function completes and then does the next thing. I removed my global variable (it should not have been a global anyway), moved the return statement up in my database call, and chained in my fetch call.
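
Here is a small sketch of that chaining pattern with the network taken out. Both getFeedUrls and fetchTitle are hypothetical stand-ins (for the database query and for fetch plus parsing), and Promise.all is one possible way to wait for every feed, not what my reader does.

```javascript
// Hypothetical stand-in for the database query.
function getFeedUrls() {
  return Promise.resolve(["https://example.com/a.xml", "https://example.com/b.xml"]);
}

// Hypothetical stand-in for fetching a feed and extracting its title.
function fetchTitle(url) {
  return Promise.resolve(`Title of ${url}`);
}

const titlesPromise = getFeedUrls()
  // Returning a value (or a promise) from then() passes it to the next then().
  .then((urls) => Promise.all(urls.map(fetchTitle)))
  .then((titles) => {
    titles.forEach((title) => console.log(title));
    return titles;
  });
```

Each step only runs once the previous one has real data, so nothing has to live in a global variable.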

It worked!

Except I had three results, not four.

CORS strikes again

The Mozilla blog was protected by CORS. Instead of fighting it any longer, I just removed the URL from my database. Some battles are not worth fighting.

The final code

My completed reader can be found on my portfolio site. I have included an HTML snippet and the full JavaScript file below. CSS is omitted because not everyone loves pastels.

HTML Snippet

<main id="content">
      <h1>Jenn's Glorious RSS Reader</h1>
      <p>Implemented in Vanilla JS because blogging never died.</p>
</main>

JS

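// Fetch one RSS feed, parse it, and append its items to the page.
function getRssFeed(feed) {
    fetch(feed)
        .then(response => response.text())
        // Parse the XML text into a document that can be queried like the DOM.
        .then(str => new window.DOMParser().parseFromString(str, "text/xml"))
        .then(data => {
            const items = data.querySelectorAll("item");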
            let html = ``;
            html += `<h2>${data.querySelector("title").innerHTML}</h2>`;
            html += `<p>${data.querySelector("description").innerHTML}</p>`;
            html += `<div class="feeds">`;
            items.forEach(el => {
                html += `
        <article>
          <h3>
            <a href="${el.querySelector("link").innerHTML}" target="_blank" rel="noopener">
              ${el.querySelector("title").innerHTML}
            </a>
          </h3>
        </article>
      `;
            });
            html += `</div>`;
            document.getElementById('content').insertAdjacentHTML("beforeend", html);
        });
}
// Load the feed URLs from Firestore, then render each feed.
function getFeeds() {
    let feeds = [];
    const database = firebase.firestore().collection('rssfeed');
    database.get().then((querySnapshot) => {
        querySnapshot.forEach((doc) => {
            feeds.push({
                id: doc.id,
                url: doc.data().url
            });
        });
        // Returning the array passes it to the next then() in the chain.
        return feeds;
    }).then(function (feeds) {
        displayFeeds(feeds);
    });
}
// Fetch and render every feed in the list.
function displayFeeds(feeds) {
    feeds.forEach(feed => { getRssFeed(feed.url); });
}
getFeeds();

All in all, it took around four hours to write. Much of that time was troubleshooting and research. It probably would have been faster if I hadn't been tired and hadn't tried to do too many things at once in the beginning.
