Author: gpolder
Date created: 26 May 2017 1:58 PM
Views: 1380
Likes: 6
Comments: 9
Tags: upcycle_it, nixie, node.js, nixie_tube, upcycled_nixie, web scraping, intel edison
[Upcycle It] Nixie Display #11 - Display blog views and likes

gpolder
26 May 2017

Time is going fast, and the project submission deadline is approaching quickly. In order to finish the project as completely as possible, I had a quick look at the original plan in my application:

to add the Intel® Edison to this display, in order to display a six digit number using Intel® Edison's wifi connection connected to the internet. An IoT nixie display so to say. The number displayed can be anything, of course it can be the current local time, from a time server, or the local time elsewhere in the world. It can be the temperature and humidity of the closest weather station, or the forecast for tomorrow. It can be the number of visitors of the project webpage, the internet speed, the position of the ISS space station from space.com, you name it.

 

Last week I wrote software to display time, date, temperature, humidity, pressure and rainfall on the nixie tubes. The time and date come directly from the Edison itself, which in turn is synchronised with a standard time server. Weather information is retrieved from the OpenWeatherMap service using a standardised API.

As you can see in the application, I promised to display some arbitrary information for which no API or protocol is available. The technique for doing this is called web scraping. As an example I chose to display the number of views and likes of my most recent blog post in this challenge.

 

Web Scraping with Node.js

Two modules are very useful for web scraping with Node.js: Request and Cheerio. While Node.js does provide simple methods of downloading data from the Internet via its HTTP and HTTPS interfaces, you have to handle them separately, to say nothing of redirects and other issues that appear when you start working with web scraping. The Request module merges these methods, abstracts away the difficulties and presents you with a single unified interface for making requests. We’ll use this module to download web pages directly into memory.

Cheerio enables you to work with downloaded web data using the same syntax that jQuery employs. To quote the copy on its home page, “Cheerio is a fast, flexible and lean implementation of jQuery designed specifically for the server.” Bringing in Cheerio enables us to focus on the data we download directly, rather than on parsing it.

The first step is to install both modules:

root@edison_nixie:~# npm install request
request@2.81.0 node_modules/request
root@edison_nixie:~# npm install cheerio
cheerio@0.22.0 node_modules/cheerio

Next we need a web link to retrieve the information from. This is of course the content page of the Upcycle It challenge:

https://www.element14.com/community/community/design-challenges/upcycleit/content

All the 'Upcycle Nixie Display' blogs are tagged with upcycled_nixie, so by selecting this tag we get only the ones we need here.

And finally we put the most recent blog on top, by selecting 'Sort by date created: Newest first'.

 

image

 

Now the URL is copied from the address bar and put in the code:

 

// Web scraping
var request = require("request"),
    cheerio = require("cheerio"),
    url = "https://www.element14.com/community/community/design-challenges/upcycleit/content?filterID=contentstatus%5Bpublished%5D~language~language%5Bcpl%5D&filterID=contentstatus%5Bpublished%5D~tag%5Bupcycled_nixie%5D&sortKey=contentstatus%5Bpublished%5D~creationDateDesc&sortOrder=0";
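To see what the copied address actually encodes, the query string can be decoded with Node's built-in URL class. This is only an illustrative sketch and not part of the display software; it assumes a Node.js version that provides URL as a global (version 10 or later), and the helper name describeFilters is made up for this example.

```javascript
// Illustrative only: decode the filter and sort parameters in the copied URL.
// Assumes a Node.js version where URL is a global (10 or later).
var url = "https://www.element14.com/community/community/design-challenges/upcycleit/content?filterID=contentstatus%5Bpublished%5D~language~language%5Bcpl%5D&filterID=contentstatus%5Bpublished%5D~tag%5Bupcycled_nixie%5D&sortKey=contentstatus%5Bpublished%5D~creationDateDesc&sortOrder=0";

// describeFilters is a hypothetical helper name, not from the project source.
function describeFilters(address) {
    var params = new URL(address).searchParams;
    return {
        filters: params.getAll("filterID"), // one entry per active filter
        sortKey: params.get("sortKey"),
        sortOrder: params.get("sortOrder")
    };
}

console.log(describeFilters(url));
```

The second filterID entry carries the upcycled_nixie tag, and sortKey carries the newest-first ordering chosen on the content page.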

Finally I wrote a function to retrieve the information from the website:

 

 

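var blog_counts = 0; // latest value to display, encoded as VVVLLL

function getBlogCounts() {
    request(url, function (error, response, body) {
        if (!error) {
            var $ = cheerio.load(body);
            // The first match is the most recent blog post
            var views = parseInt($('td.j-td-views').children().first().text(), 10);
            // The text looks like "Show 7 likes7"; the second word is the count
            var likes = parseInt($('td.j-td-likes').children().first().text().split(' ')[1], 10);
            console.log('Views:', views, 'Likes:', likes);
            // Only update when both values were parsed successfully
            if (!isNaN(views) && !isNaN(likes)) {
                blog_counts = views * 1000 + likes;
            }
        } else {
            console.log("We've encountered an error: " + error);
        }
    });
}

getBlogCounts();
setInterval(getBlogCounts, 600000); // interval of 10 minutes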

 

 

So, what are we doing here?

First, we use the Request module to download the page at the URL specified above via the request function. We pass in the URL that we want to download and a callback that will handle the results of our request. When the data is returned, that callback is invoked and passed three variables: error, response and body. If Request encounters a problem downloading the web page and can’t retrieve the data, it will pass a valid error object to the function, and the body variable will be empty. Before we begin working with our data, we check that there aren’t any errors; if there are, we just log them so we can see what went wrong.
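The checks described above can be pulled out into a small function that is easy to try without a network connection. This is a sketch rather than the code running on the display: the name checkDownload is hypothetical, and the HTTP status-code and empty-body tests are extra assumptions that go beyond the simple error check used in this post.

```javascript
// Sketch of the checks made before the page is handed to the parser.
// checkDownload is a hypothetical helper; the status-code and empty-body
// tests are additions for illustration.
function checkDownload(error, response, body) {
    if (error) {
        return { ok: false, reason: "request failed: " + error.message };
    }
    if (!response || response.statusCode !== 200) {
        return { ok: false, reason: "unexpected HTTP status " + (response && response.statusCode) };
    }
    if (typeof body !== "string" || body.length === 0) {
        return { ok: false, reason: "empty body" };
    }
    return { ok: true, body: body };
}

// Simulated callback arguments, so no request is actually made.
console.log(checkDownload(new Error("ENOTFOUND"), undefined, undefined));
console.log(checkDownload(null, { statusCode: 200 }, "<html></html>"));
```

Inside a real request callback the three arguments would simply be passed straight through to such a function.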

If all is well, we pass our data off to Cheerio. Then we’ll be able to handle the data like we would any other web page, using standard jQuery syntax. To find the data we want, we’ll have to build a selector that grabs the element(s) we’re interested in from the page. If you navigate to the URL I’ve used for this example in your browser and start exploring the page with developer tools, you’ll see the following:

 

image

 

 

Notice that the number of views is selected with td.j-td-views, while the likes are selected with td.j-td-likes. Each selector matches a whole list of cells, one for each blog post, so we take only the first one (children().first()), which is the most recent blog in this case. The likes cell contains text as well as the number, in this case "Show 7 likes7", from which the first number is extracted by splitting the string on spaces and taking the second element (.split(' ')[1]).
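The string handling for the likes can be tried on its own, without downloading anything. The sketch below assumes the text always has the form "Show N likesN"; the helper name parseLikes and the fallback to 0 are illustrative choices, not taken from the project source.

```javascript
// Sketch: pull the like count out of text such as "Show 7 likes7".
// parseLikes is a hypothetical helper name, not from the project source.
function parseLikes(text) {
    var words = text.trim().split(" "); // ["Show", "7", "likes7"]
    var count = parseInt(words[1], 10);
    return isNaN(count) ? 0 : count;    // fall back to 0 if the layout changes
}

console.log(parseLikes("Show 7 likes7")); // prints 7
```

A regular expression would be more tolerant of layout changes, but the split mirrors what is done in this post.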

Finally, now that we have the values, it is a simple matter of combining them into one number for the six-digit nixie display: the number of views multiplied by 1000, plus the number of likes. The function that grabs the information from the blog list runs at an interval of ten minutes. The showAll loop is extended with an extra entry for showing the information:

 

 

               case 6:
                    showNumber(blog_counts);
                    break;
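The packing of views and likes into a single six-digit value can be sketched as follows. The function names encodeCounts and toSixDigits are made up for this example, and clamping each field to 999 is an assumption that the original code does not make; the zero padding only illustrates how the digits land on the tubes and says nothing about how showNumber works internally.

```javascript
// Sketch: pack views and likes into one six-digit value, VVVLLL.
// encodeCounts and the clamping to 999 are assumptions for illustration.
function encodeCounts(views, likes) {
    var v = Math.min(Number(views), 999); // scraped values arrive as strings
    var l = Math.min(Number(likes), 999); // keep each field to three digits
    return v * 1000 + l;
}

// Left-pad with zeros so the value fills all six tubes.
function toSixDigits(value) {
    return ("000000" + value).slice(-6);
}

console.log(toSixDigits(encodeCounts("123", "5"))); // prints 123005
```

Without the clamp, a post with more than 999 views would no longer fit the VVVLLL layout, so some limit or a different scaling is worth considering.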

 

As proof of the pudding, here is an image of the display showing 123 views and 5 likes for [Upcycle It] Nixie Display #10 - Software stuff at the time of this writing (Fri May 26 15:12:27 CEST 2017).

 

image

 

Works like a charm. For displaying other information such as the internet speed, the position of the ISS space station from space.com, or whatever you want, the software can be adapted accordingly.

 

Updated function table

The table below shows an update of the available functions with their units and the number format.

Number  Function              Format
0       Time                  hhmmss
1       Date                  YYMMDD
2       Temperature (°C)      0000CC
3       Humidity (%)          0000HH
4       Pressure (hPa)        00PPPP
5       Rain Volume last 3H   RRRRRR
6       Blog views and likes  VVVLLL

 

The full source is on GitHub: https://github.com/AgriVision/nixie_display. In this code the URL of the blog index page has been moved to settings.json, which is a better place than main.js.
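Reading the URL from a settings file could look like the sketch below. The key name blogIndexUrl and the helper loadSettings are assumptions for illustration only; the actual layout of settings.json in the repository may differ.

```javascript
// Sketch: read the blog index URL from a JSON settings file.
// The key name "blogIndexUrl" is hypothetical; check settings.json in the
// repository for the real one.
var fs = require("fs");
var os = require("os");
var path = require("path");

function loadSettings(file) {
    return JSON.parse(fs.readFileSync(file, "utf8"));
}

// Demonstration with a temporary file instead of the real settings.json.
var demoFile = path.join(os.tmpdir(), "nixie_settings_demo.json");
fs.writeFileSync(demoFile, JSON.stringify({ blogIndexUrl: "https://example.com/content" }));

var settings = loadSettings(demoFile);
console.log(settings.blogIndexUrl);
```

Keeping the address in a settings file means the tag or sort order can be changed without touching the program itself.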

 

This finishes the blog of this week. I will keep an eye on my nixie display to be informed on the views and likes.

Next week is for some final tweaks and wrap up.

 

Stay tuned!

Top Comments

  • gpolder, over 8 years ago (+4)
    First thing I saw this morning when entering my workspace : In the mean time I also fixed a software error. The line: blog_counts = views * 100 + likes ; Showed the expected value (075006), although the…
  • balearicdynamics, over 8 years ago (+3)
    Great post! I love always more the nixies Take a look, what do you think to a hack to this '80s train station clock (don't ask me how I got it) with some more tech info with a set of nixies in the middle…
  • carmelito, over 8 years ago (+3)
    Awesome post, this has got me thinking. I could do something similar using python-beautifulsoup , but looks like time is not on my side. Maybe, I will try this after the challenge. Thanks - Carmelito
  • dougw, over 8 years ago

    Super cool - bonus points and kudos for scraping the likes
