this post was submitted on 08 Sep 2024
Data is Beautiful


Collected US 2024 tech job postings from Indeed and embedded them with OpenAI's large text embedding model. Reduced dimensionality with UMAP and clustered with HDBSCAN. Topic modeled with the OpenAI chat API. Visualized with DataMapPlot. The full interactive map is on GitHub Pages (https://hazondata.github.io/). I also have real-time insights into tech job postings on my site, hazon.fyi

https://old.reddit.com/r/dataisbeautiful/comments/1fakvwv/oc_clustering_250k_tech_job_postings_in_2024/

[–] [email protected] 5 points 2 months ago

no wonder it was taking so long to load; it's a 58MB HTML file.

really cool stuff though - I'd love to see more information about what's on the screen:

  • Number of postings (updated when filtered using the search);
  • Some way to visualize posts in the intersection of these clusters, e.g. Software Dev with Education, or AI with DevOps;
  • Word cloud of most common terms in the posting selection;
  • Ways to export the filtered data.
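
For what it's worth, here is a rough sketch of how those four features could work on the data side. Everything in it is an assumption for illustration: the sample rows, the field names (`title`, `clusters`, `text`), the stopword list, and the helper names are all made up, since the thread doesn't show the real schema. It also assumes each posting can carry more than one cluster tag, which plain HDBSCAN labels would not give you on their own.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample rows; the real dataset's schema is not shown in the thread.
POSTINGS = [
    {"title": "ML Engineer", "clusters": {"AI", "DevOps"},
     "text": "Deploy machine learning models with Kubernetes"},
    {"title": "Instructional Developer", "clusters": {"Software Dev", "Education"},
     "text": "Build learning software for schools"},
    {"title": "Backend Developer", "clusters": {"Software Dev"},
     "text": "Build APIs and backend software"},
]

# Tiny illustrative stopword list (an assumption, not a real NLP resource).
STOPWORDS = {"and", "for", "with", "the", "a", "of"}


def search(postings, query):
    """Case-insensitive substring search over title and text."""
    q = query.lower()
    return [p for p in postings
            if q in p["title"].lower() or q in p["text"].lower()]


def in_intersection(postings, *clusters):
    """Postings tagged with every one of the given clusters."""
    wanted = set(clusters)
    return [p for p in postings if wanted <= p["clusters"]]


def top_terms(postings, n=5):
    """Most common non-stopword terms: the counts a word cloud would size by."""
    counts = Counter()
    for p in postings:
        words = re.findall(r"[a-z]+", p["text"].lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


def to_csv(postings):
    """Serialize the current selection as CSV text for export."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["title", "clusters", "text"])
    for p in postings:
        writer.writerow([p["title"], "|".join(sorted(p["clusters"])), p["text"]])
    return buf.getvalue()


selection = search(POSTINGS, "software")
print(len(selection))                       # number of postings after filtering
print(in_intersection(POSTINGS, "AI", "DevOps"))
print(top_terms(selection, 3))
print(to_csv(selection))
```

In the actual page this logic would presumably live in the browser next to the map, but the shape is the same: filter once, then derive the count, the term frequencies, and the export from that one selection.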