David Martin Riveros

David Martin Riveros: How to Scrape Web Data at Scale

Most small e-commerce businesses face an uphill battle against corporate giants who hoard competitive intelligence. While public data exists everywhere online, extracting it remains a technical challenge that favors companies with deep pockets and advanced resources. David Martin Riveros recognized this disparity three years ago when he founded Icebergdata, creating tools that level the playing field for smaller competitors seeking market insights.

Democratizing Data Access for Competitive Insights

The problem starts with a simple reality that most business owners don’t think about. “Not everybody has access to public data,” David explains, describing the core mission behind Icebergdata. His team has spent three years helping brands extract valuable datasets from websites, turning what was once an exclusive corporate advantage into an accessible business tool. The stakes are higher than most people realize. Large corporations deliberately make data collection difficult because they understand its power. “When you have access to public data, you could easily level the playing field and eliminate information asymmetries that corporations use against small e-commerce players,” David notes. This isn’t just about having more information. It’s about survival in a marketplace where knowledge directly translates to profit margins and market share.


Smart data collection opens doors that most small businesses don’t even know exist. Companies can capture competitor pricing, track inventory fluctuations, and analyze historical product reviews to make better strategic decisions. David points out that businesses can “optimize pricing strategy as well as run very competitive promotion campaigns that would definitely unlock market share for a small brand.” This kind of intelligence gathering was once reserved for companies with massive IT budgets. Small businesses either paid exorbitant fees to third-party services or simply operated blind to market conditions. Knowing a competitor’s pricing strategy rather than guessing at it can mean the difference between profit and loss in today’s tight-margin environment.

Use AI-Assisted Parsing

Web scraping sounds straightforward until you actually try it. The first major obstacle comes from websites that constantly shuffle their layouts. “Modern websites randomly reshuffle their layout. As a human, you may not notice these subtle changes. However, a small change on the layout of a website could completely break an average scraping bot,” David explains. A product price might jump between different positions on a page, leaving traditional scrapers confused and useless.


The solution involves something David calls “self-healing scrapers.” “When parsing fails, the scraping bot will call an AI assistant. This AI assistant becomes aware of the new website structure, and this will trigger the AI to suggest multiple patterns,” he says. The system becomes smarter over time, learning to adapt when websites change their structure instead of simply breaking down.
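The fallback loop David describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Icebergdata's implementation: the class name `SelfHealingParser`, the regex-based patterns, and the `suggest_patterns` callback (a stand-in for whatever AI assistant is used) are all hypothetical. The idea shown is simply: try the known patterns, ask the assistant for candidates when they all fail, and remember the first candidate that works.

```python
import re
from typing import Callable, List, Optional, Tuple


class SelfHealingParser:
    """Hypothetical sketch: extract a price, asking an assistant for new patterns on failure."""

    def __init__(self, patterns: List[str],
                 suggest_patterns: Callable[[str], List[str]]):
        # Each pattern is a regex with one capturing group around the price.
        self.patterns = list(patterns)
        # Stand-in for the AI assistant: takes page HTML, returns candidate regexes.
        self.suggest_patterns = suggest_patterns

    @staticmethod
    def _try_patterns(html: str, patterns: List[str]) -> Optional[Tuple[str, str]]:
        for pattern in patterns:
            try:
                match = re.search(pattern, html)
                if match:
                    return pattern, match.group(1)
            except (re.error, IndexError):
                # Skip invalid regexes or patterns without a capturing group.
                continue
        return None

    def extract_price(self, html: str) -> Optional[str]:
        hit = self._try_patterns(html, self.patterns)
        if hit:
            return hit[1]
        # Parsing failed: ask the assistant for several candidate patterns.
        candidates = self.suggest_patterns(html)
        hit = self._try_patterns(html, candidates)
        if hit is None:
            return None
        # "Heal": remember the working pattern so the next page parses directly.
        self.patterns.insert(0, hit[0])
        return hit[1]


def fake_assistant(html: str) -> List[str]:
    # Assumption: a real assistant would inspect the HTML; this one returns fixed guesses.
    return [r'data-price="([\d.]+)"', r'<b class="cost">\$([\d.]+)</b>']


parser = SelfHealingParser([r'<span class="price">\$([\d.]+)</span>'], fake_assistant)
print(parser.extract_price('<span class="price">$19.99</span>'))      # known layout
print(parser.extract_price('<div><b class="cost">$21.50</b></div>'))  # changed layout
```

Validating each suggestion against the live page before storing it is a deliberate choice in this sketch: the assistant's output is treated as a guess, and only a pattern that actually extracts a value is kept.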

Building for Scale and Responsibility

Getting data once is easy. Getting it consistently over months or years, while respecting website performance, is much harder. David emphasizes that production systems need to handle thousands of requests every day, recover from failures, and follow proper protocols. “A production-grade pipeline has to do it thousands of times a day, consistently throughout the year,” he explains. Icebergdata’s approach is built around that challenge. Their system runs jobs in parallel, pauses when a site shows signs of stress, and adapts when the structure of a page changes. This keeps data flowing, ensures infrastructure costs stay predictable, and allows websites to continue serving their users without disruption.
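One way to picture the three behaviors mentioned here (parallel jobs, pausing under stress, recovering from failures) is the minimal Python sketch below. It is an assumption-laden illustration rather than Icebergdata's pipeline: the function names `fetch_with_backoff` and `run_jobs` are hypothetical, the `fetch` callable is injected by the caller and expected to return a `(status, body)` pair, and treating HTTP 429 and 503 responses as "signs of stress" is this sketch's own interpretation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional, Tuple

# Assumption: these HTTP statuses are read as "the site is under stress".
STRESS_STATUSES = {429, 503}

Fetch = Callable[[str], Tuple[int, str]]


def fetch_with_backoff(url: str, fetch: Fetch, max_attempts: int = 4,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Fetch one URL, pausing with exponential backoff on stress or network errors."""
    for attempt in range(max_attempts):
        try:
            status, body = fetch(url)
        except OSError:
            status, body = None, ""  # network failure: retry after a pause
        if status == 200:
            return body
        if status is not None and status not in STRESS_STATUSES:
            return None  # e.g. 404: retrying will not help
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... the base delay
    return None


def run_jobs(urls: Iterable[str], fetch: Fetch, workers: int = 4,
             **kwargs) -> Dict[str, Optional[str]]:
    """Run the fetches in parallel and map each URL to its body, or None on failure."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda u: fetch_with_backoff(u, fetch, **kwargs), urls)
        return dict(zip(urls, results))
```

Capping `workers` and `max_attempts` bounds both the load placed on the target site and the cost of a run, which is one plausible reading of how such a design keeps infrastructure costs predictable.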


All this technical work supports a much bigger goal. “Data collection should be for everybody,” David says. Right now, only companies with large tech budgets have access to the kind of market intelligence that drives smart decisions. He wants to change that. The tools already exist. Smaller businesses shouldn’t have to rely on guesswork when it comes to competitor pricing or market trends. They just need access to the right data and a responsible way to collect it.


Follow David Martin Riveros on LinkedIn or check out Icebergdata to learn how data access can reshape small business strategy.
