element14 Community
Author: Former Member
Date created: 13 Jun 2015 4:14 PM
Views: 755, Likes: 0, Comments: 2
Tags: raspberry, farm, websites, parsing, rpiexpert, crawling, raspberry_pi_projects, server

Use Raspberry server farm for crawling / parsing websites on a larger scale

Former Member
13 Jun 2015

Hello forum,

 

Being totally new to the Raspberry Pi world, I hope to get a few comments on my (maybe crazy) idea of using a Raspberry Pi farm instead of Intel i7 systems as a crawling and parsing system.

 

Application background: I have developed my own crawling software (internally using wget) and very specialized parsing software. The goal is to continuously crawl and parse several tens of millions of websites for special purposes (very different from normal full-text searches). It currently runs under openSUSE Linux on two Intel PCs (Core i7-3770 and i7-4770) and is nearing the end of beta testing. In the end I will need about 7-10 such PCs to run all tasks continuously.

 

Since I am working on a self-financed, low budget, I had the idea of using a farm of Raspberry Pi systems to handle these tasks. The software could be ported, although this would take some effort, which must be justified.

 

However, this would only make sense if most of the following expectations could be fulfilled by a Raspberry server farm:

 

1) For the same processing throughput, the initial hardware investment must be less than 50% of what Intel-based PCs would cost, to justify the additional hassle, handling and software port.

2) The power consumption should be significantly lower.

3) Reliability should reach around 80% of that of the Intel systems (i.e. somewhat less could be accepted).

4) Crawling / downloading could be separated from parsing, and it would be sufficient if only the parsing were delegated to the Raspberry Pis. Crawling / downloading uses a RAM disk as intermediate storage because the SSDs quickly reach their limits, so it might be better to keep that part on the Intel PCs with 32 GB of RAM.
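
Expectation 1 comes down to simple arithmetic once the parsing throughput of one PC and one board has been measured. The following is only a minimal sketch, assuming throughput is measured in pages parsed per hour; the function names are hypothetical, and any numbers fed into them are placeholders rather than measurements from this thread.

```python
import math


def boards_needed(pc_count: int,
                  pc_pages_per_hour: float,
                  board_pages_per_hour: float) -> int:
    """Boards required to match the parsing throughput of pc_count PCs.

    Both rates are assumed to be measured with the same parser and the
    same sample of pages; they are inputs, not known values.
    """
    total_rate = pc_count * pc_pages_per_hour
    return math.ceil(total_rate / board_pages_per_hour)


def meets_cost_criterion(pc_count: int, pc_unit_cost: float,
                         board_count: int, board_unit_cost: float,
                         max_ratio: float = 0.5) -> bool:
    """True if the board farm costs less than max_ratio of the PC farm.

    board_unit_cost should include everything a board needs to run
    (power supply, storage card, share of a network switch).
    """
    return board_count * board_unit_cost < max_ratio * pc_count * pc_unit_cost
```

With placeholder rates of 1000 and 100 pages per hour, `boards_needed(8, 1000.0, 100.0)` gives 80 boards for 8 PCs, so the per-board cost including accessories would have to stay below 5% of the PC price to satisfy the 50% rule. The real ratio has to come from a benchmark of the actual parser.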

 

What do the Raspberry experts say?

 

a) Forget about it!

b) A viable concept worth considering.

c) Cost saving will not be significant enough.

d) Or what else?

 

Thank you very much in advance for your comments.

 

Best regards

FrankB

  • johnbeetem over 10 years ago

    While the RasPi 2 is a lot faster than the RasPi 1 and is viable for some desktop applications, it does have limited processing power, memory, disk I/O, and networking performance. Whether RasPi 2 has enough computing power to handle your application depends on what's limiting your performance. If you're compute-bound, RasPi 2 will probably disappoint you. If you're using a lot more than 1 GB RAM, RasPi 2 could be a disaster. If your application needs SATA disks to get decent performance, RasPi 2 will disappoint: it only has SD cards or USB 2.0, both of which are slow compared to SATA. If you need hundreds of Mb/s of Ethernet (or GbE), RasPi's USB-based Ethernet will disappoint.

     

    OTOH, if your application is limited by your external Internet connection, then it may not matter that much how much processing power your CPUs have if they spend most of their time waiting for data from the Internet.
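
One rough way to tell these two cases apart is to compare CPU time with wall-clock time for a representative task. The sketch below is only an illustration using Python's standard `time` module; `cpu_fraction` and `busy_loop` are hypothetical helper names, and the measurement covers the calling process only, so CPU time spent in child processes such as wget is not counted.

```python
import time


def cpu_fraction(task, *args) -> float:
    """Run task(*args) once and return CPU time divided by wall-clock time.

    A value near 1.0 suggests the task is compute-bound; a value near 0.0
    suggests it mostly waits (network, disk, sleep). Only CPU time of the
    current process is counted, not that of child processes.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def busy_loop(n: int) -> int:
    """Stand-in for a compute-bound parsing step."""
    return sum(i * i for i in range(n))
```

Calling `cpu_fraction(busy_loop, 2_000_000)` should report a high fraction, while `cpu_fraction(time.sleep, 0.2)` should report a low one. Running the real parser and the real downloader through the same kind of measurement shows which side of the tradeoff applies.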

     

    If your application involves a lot of image processing, RasPi might perform quite well. That GPU is very powerful if you can find or write the software you need to make use of it.

     

    So these are the tradeoffs I can think of. I suspect RasPi 2 won't have the performance to do what you want and you'd be better off with PCs. Alternatively, you could look into much higher performance SBCs such as the US$100 ODROID-XU3 Lite, which has eight ARM cores (four 1.8 GHz Cortex-A15, four Cortex-A7), 2 GB DDR, USB 3.0, and MMC.

  • DAB over 10 years ago

    One thing I learned early about computers is that if you have a vision and persistence, you can do most anything.

     

    So I think that with some good system design, you should be able to make your server work.

     

    After you get it functional, then you can do performance testing to see where better solutions could make it work faster and more reliably.

     

    DAB
