this post was submitted on 16 Sep 2023
92 points (95.1% liked)

Canada

7078 readers
517 users here now

What's going on Canada?



Rules

Reminder that the rules for lemmy.ca also apply here. See the sidebar on the homepage:

https://lemmy.ca


founded 3 years ago

It would be nice to be able to bring to light the price gouging that is taking place at grocery stores in Canada.

all 25 comments
[–] [email protected] 18 points 11 months ago (1 children)

The project is open source, so you might be able to leverage it for Canadian data. All you need is:

  • Understanding of the expected format for the project
  • Access to data from Canadian retailers. This can be acquired via APIs (these are usually free) or by scraping their sites.
[–] [email protected] 12 points 11 months ago (1 children)

At the bottom of the chain on mastodon the creator says they use the search APIs of the store websites. I wouldn't have expected those to be easily accessible!

[–] [email protected] 4 points 11 months ago

Yeah, a lot of chains even have a documented, developer-friendly API. If that's not available, though, you can usually figure out the API just by looking at the calls your browser makes when visiting a page. Most sites use a REST API for catalog pages, and the results are then rendered with JavaScript.

If that doesn't work, then you can usually scrape everything with Selenium. It's a little harder to do, but still quite manageable, though that usually has to be a background job, as it's slow.
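To make the "figure out the API from the browser's calls" idea concrete, here's a rough sketch of what handling one of those JSON responses could look like. The payload shape (`products`, `sku`, `name`, `price`) and the names `PriceRecord` and `parse_catalog_response` are all made up for illustration; a real store's response will look different, and you'd fetch the body yourself.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class PriceRecord:
    store: str
    sku: str
    name: str
    price_cents: int


def parse_catalog_response(store: str, body: str) -> list[PriceRecord]:
    """Turn one JSON search response into price records.

    Assumes a hypothetical payload: {"products": [{"sku", "name", "price"}]}.
    """
    payload = json.loads(body)
    records = []
    for item in payload.get("products", []):
        price = item.get("price")
        if price is None:
            continue  # skip items that have no listed price
        # Go through Decimal so "4.99" becomes exactly 499 cents.
        cents = int(Decimal(str(price)) * 100)
        records.append(PriceRecord(store, str(item["sku"]), item["name"], cents))
    return records


# Hand-written sample standing in for a captured response body.
SAMPLE = (
    '{"products": ['
    '{"sku": 1001, "name": "Flour 2.5 kg", "price": "4.99"}, '
    '{"sku": 1002, "name": "Milk 2 L", "price": null}]}'
)

for record in parse_catalog_response("example-store", SAMPLE):
    print(record)
```

Storing prices as integer cents avoids float rounding surprises when you compare prices across days.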

[–] [email protected] 15 points 11 months ago (2 children)

The issue with this sort of thing is primarily one of data entry, rather than "tech savvy" as such. Defining the database is easy compared to getting the data in there.

Quick options would include parsing the information out of the stores' websites (possible, but if JavaScript is involved you may be looking at puppeting a browser with Selenium, which isn't fast and can get tedious, and the approach depends on the websites being complete, accurate, and up-to-date), or hacking or snooping on the stores' own mobile apps (if they have them) to get price information in a usable format. Approaches like this are inherently brittle, as even trivial changes made on the grocery chains' end can cause them to break. Scraping information without a defined API or the cooperation of the owner of the data is a moving target. From experience, I can tell you that it gets annoying fast.

In the case of the Austrian government, they probably wanted that cooperation and defined API. Which would have required careful negotiations with each company and paid programmers looking at the corporate databases. That would have increased their cost and lengthened their projected timeframe. Corruption and corporate greed did the rest.

[–] [email protected] 7 points 11 months ago

They use the search APIs of the grocery store websites.

[–] [email protected] 1 points 11 months ago (2 children)

So essentially it was possible over there due to proper/favourable conditions, whereas here it would be much more difficult?

[–] [email protected] 6 points 11 months ago

The responsible minister claimed it's an immense task and will take until autumn. It will only include 16 product categories (think flour, milk, etc.). And it will only be updated once a week.

I mean that's pretty pathetic. Better than nothing, but "only updated once a week" sounds like "the intern who has to enter the prices works only for 20 hours", not like they created an API and told the grocery chains to upload their prices.

[–] [email protected] 3 points 11 months ago

Unknown. I don't use the grocery chains' websites (I'm of the "go to the nearest physical store and figure it out once there" persuasion), so I don't know what the complexity level would be. It's possible that they're all older-school sites where you can lift the data straight from the HTML, which is relatively fast.
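For the "lift the data straight from the HTML" case, a minimal sketch using only the standard library might look like this. The markup (`li.product` with `span.name` and `span.price`) is an assumption invented for the example, not any real store's page, and `PriceListParser` / `extract_prices` are hypothetical names.

```python
from html.parser import HTMLParser


class PriceListParser(HTMLParser):
    """Collect (name, price) pairs from an assumed static product list."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._current = None  # dict for the <li class="product"> being read
        self._field = None    # "name" or "price" while inside that span

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "product" in classes:
            self._current = {}
        elif tag == "span" and self._current is not None:
            if "name" in classes:
                self._field = "name"
            elif "price" in classes:
                self._field = "price"

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            if "name" in self._current and "price" in self._current:
                self.items.append(
                    (self._current["name"].strip(), self._current["price"].strip())
                )
            self._current = None


def extract_prices(html: str):
    parser = PriceListParser()
    parser.feed(html)
    return parser.items


# Hand-written sample page fragment (hypothetical markup).
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Flour 2.5 kg</span> <span class="price">$4.99</span></li>
  <li class="product"><span class="name">Milk 2 L</span> <span class="price">$5.49</span></li>
</ul>
"""

print(extract_prices(SAMPLE_HTML))
```

This only works when the prices are in the HTML the server sends. If the page fills them in with JavaScript, you're back to the API or a driven browser.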

[–] [email protected] 12 points 11 months ago (2 children)
[–] [email protected] 2 points 11 months ago

Are we looking to expand what’s here on the Grocery Tracker to incorporate what they are doing with the Austrian site?

I’d also like to look at other pinch points of government heel dragging. Housing, energy, medical, transportation, telecom, news etc. We all see these government contracts go out for seven figures and it’s always shown to be blown out of proportion.

A nice added bonus to the project in Austria was someone giving historical data. It would be great to have a similar leg up for Canada.

[–] [email protected] 1 points 11 months ago

Nice find! This looks like exactly it, but Canadian.

[–] [email protected] 7 points 11 months ago (2 children)

With the advent of e-ink price tags, it wouldn't be surprising to me if something similar goes on with the prices of generic and medium tier items.

We'd have to see if an API is available somewhere first.

[–] [email protected] 7 points 11 months ago* (last edited 11 months ago)

Even without an API it should be possible, in theory, to just parse the data directly from their websites.

This also gives the grocery stores less of a leg to stand on in terms of legal or practical recourse. They chose to create a publicly browsable database of their prices; all you're doing is browsing it.

[–] [email protected] 4 points 11 months ago (1 children)

Huh. Now you've got me wondering... Could you leave a device hidden in store that receives the IR signals that program the tags, capture that information, then parse it out later? You could literally log price changes at the shelves, in real time.

[–] [email protected] 3 points 11 months ago (1 children)

I don't think they're IR. Something similar to wifi. Surely encrypted.

[–] [email protected] 3 points 11 months ago

Looks like there are two varieties - IR or Bluetooth Low-Energy (BLE).

[–] [email protected] 3 points 11 months ago (1 children)

Would a system identifying products from a receipt work for this? Combined with other data sources (like web scraping), it would make it a lot easier to crowdsource the data, even if only sorta technically inclined people do it.

[–] [email protected] 4 points 11 months ago (2 children)

It would help to some extent, but to really get people to buy in you'd need an app to do the heavy lifting (that is, it's easier to get people to snap a photo of their receipt than to type the info in one character at a time). Some people might still be willing to do it without, but how many?

You'd also have to relate the abbreviations that often appear on grocery receipts back to the items they represent, which is more data entry.
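As a rough idea of what that abbreviation mapping could look like: a hand-maintained lookup table, with a fuzzy fallback for small spacing or typo differences. The table entries and the name `resolve_receipt_line` are invented for illustration, and the 0.8 cutoff is a guess that would need tuning on real receipts.

```python
import difflib

# Hypothetical table; in practice this is the data-entry work.
KNOWN_ABBREVIATIONS = {
    "ORG BAN": "Organic bananas",
    "2% MLK 2L": "2% milk, 2 L",
    "AP FLOUR 2.5KG": "All-purpose flour, 2.5 kg",
}


def resolve_receipt_line(text, table=KNOWN_ABBREVIATIONS, cutoff=0.8):
    """Map a receipt abbreviation to a product name, or None if unknown."""
    # Normalize case and collapse runs of whitespace before looking up.
    key = " ".join(text.upper().split())
    if key in table:
        return table[key]
    # Fall back to the closest known abbreviation, if it is close enough.
    close = difflib.get_close_matches(key, list(table), n=1, cutoff=cutoff)
    return table[close[0]] if close else None


print(resolve_receipt_line("org  ban"))
print(resolve_receipt_line("2% MLK 2 L"))
print(resolve_receipt_line("CHKN BRST"))
```

Anything that comes back as `None` would go into a queue for a person to label, which then grows the table.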

[–] [email protected] 2 points 11 months ago

Would stores attempt to hire teams to put in junk data?

[–] [email protected] 2 points 11 months ago

Yeah, I was thinking something along the lines of lots n lots of easy shitty data (e.g. anyone who can take a picture), some pretty good data (e.g. hand-labeled receipts), and some 100% reliable data (scraped/API), then some sort of system to correlate the three. Especially when prices match exactly between receipt and API, a fair-sized database could create itself.

Also would need some sort of processing center to handle the many image processing requests, but maybe that could be done client side
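One possible way to correlate the three tiers, sketched very loosely: give each source a trust weight, let observations vote on a price per item, and flag a price as confirmed when two different kinds of source agree on it. The weights and the name `reconcile` are assumptions for illustration only.

```python
from collections import defaultdict

# Assumed trust weights per source; real values would need tuning.
SOURCE_WEIGHT = {"photo": 1, "receipt": 3, "api": 10}


def reconcile(observations):
    """Pick the best-supported price per item.

    observations: iterable of (item, price_cents, source) tuples.
    """
    votes = defaultdict(lambda: defaultdict(int))
    sources = defaultdict(lambda: defaultdict(set))
    for item, price, source in observations:
        votes[item][price] += SOURCE_WEIGHT[source]
        sources[item][price].add(source)

    result = {}
    for item, by_price in votes.items():
        best = max(by_price, key=by_price.get)
        result[item] = {
            "price_cents": best,
            "score": by_price[best],
            # Confirmed when at least two source types report the same price.
            "confirmed": len(sources[item][best]) >= 2,
        }
    return result


# Made-up observations for illustration.
OBSERVATIONS = [
    ("milk 2L", 549, "api"),
    ("milk 2L", 549, "receipt"),
    ("milk 2L", 599, "photo"),
    ("flour 2.5kg", 499, "photo"),
]

print(reconcile(OBSERVATIONS))
```

A weighting like this also blunts junk submissions somewhat, since a pile of low-trust photos has to outvote a scraped or receipt-backed price.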

[–] [email protected] 3 points 11 months ago

Not exactly what you're looking for, but check out this Marketplace piece:

https://www.cbc.ca/news/business/marketplace-shrinkflation-1.6654780

[–] [email protected] 3 points 11 months ago

If you do get started with this, I'd love to follow along and find a place where I can help. If you guys make a community or mastodon account for example, please link it :)

[–] [email protected] 1 points 11 months ago

Can't, this would be illegal within a year. Scraping data is already taboo. How fucking dumb is that.

It's why I hate the "starving artist worried about AI scraping" stories. They will be used to usher in stronger laws to prevent us from scraping this data. It's a double-edged sword.

[–] [email protected] 1 points 11 months ago

Pretty easy to get something basic set up if you get enough people to crowd-source data with photos of stuff in grocery stores and their receipts, along with some scraping to get data that's available online. It's a project that's been on my backlog for a while, but I can bump it up if others want to join me in making this.