|
|
|
|
So I scraped 450k Amazon product URLs, and now I've finally finished writing my scraper and kicked it off with some fresh proxies. Downloading massive amounts of data and images from the big A hole.
|
|
|
|
cauz
May 14, 2017, 2:30 p.m.
|
|
|
|
have more than a half million product URLs (which is really the hard part with Amazon; they make it extremely difficult for scrapers to crawl their entire site). After cleaning up this list and potentially trying to get even more products, I will continue to modify my PHP scraper, this time for use with Amazon. It rotates through proxies and user agents, and it has worked well on Google Maps, Yelp, and your university's student directories, so it should get past Amazon's blocking no problem. My scraper nowadays saves all the data into XML so I can import it through certain plugins, and that also gives me a super easy way to convert to any form I need. Originally my scraper rotated through Tor proxies and saved all data directly into MySQL; over time I created SQL files for importing. Now that WordPress is used so extensively and doesn't receive penalties in the search engines like it used to, I can just throw all the data in there and make as many copies and variations of the sites as I want. and make it loo...
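As a rough illustration of the "save to XML, convert to anything" idea, here is a minimal sketch in Python rather than the PHP the scraper is actually written in. The element names (`products`, `product`) and the record fields are assumptions made up for the example; the real scraper's schema is not shown in the post.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List


def products_to_xml(products: Iterable[Dict[str, str]]) -> str:
    """Serialize product records to one XML string.

    Assumption: every dict key is a valid XML tag name (e.g. "id", "title").
    """
    root = ET.Element("products")
    for product in products:
        item = ET.SubElement(root, "product")
        for field, value in product.items():
            # ElementTree escapes characters such as "&" and "<" on output.
            ET.SubElement(item, field).text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(xml_text: str) -> List[Dict[str, str]]:
    """Read the XML back into plain dicts, ready for CSV, SQL, or a CMS import."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text or "" for child in item}
        for item in root.findall("product")
    ]


if __name__ == "__main__":
    # Hypothetical record, not real scraped data.
    sample = [{"id": "B000000001", "title": "Widget & Co"}]
    print(products_to_xml(sample))
```

Keeping XML as the single intermediate format means each new target (a plugin import, SQL files, another site variation) only needs its own small converter from the parsed rows.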
|
|
|
|
I bet I could scrape the images using ScrapeBox with free proxies to save on costs. The only reason I used paid proxies for the data is that I want to be sure it's US data, to get US results for each product ID. And they're more reliable.
|
|
|
|
This list of 400k product IDs includes lots of copies of the same product with a different tracking number, so I'm only getting maybe 30k unique products off that list in total. I was going to scrape more afterwards, so on my next ID-gathering run I'll find better ways to remove redundancy and save some money. I've used almost 30 GB of bandwidth through those proxies over the past few days, but I also download huge high-res images.
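One way to remove that redundancy before spending bandwidth is to reduce every URL to its product ID and keep each ID once. The sketch below assumes the URLs follow the common `/dp/<ID>` or `/gp/product/<ID>` shape with a 10-character ID, and that the "tracking number" lives in the rest of the path or the query string. The sample URLs and IDs are made up for illustration.

```python
import re
from typing import Iterable, List, Optional

# Assumed URL shapes: .../dp/<ID> or .../gp/product/<ID>, where <ID> is
# 10 uppercase letters or digits, followed by "/", "?", "#", or end of string.
_PRODUCT_ID = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?=[/?#]|$)")


def product_id(url: str) -> Optional[str]:
    """Return the product ID embedded in the URL, or None if none is found."""
    match = _PRODUCT_ID.search(url)
    return match.group(1) if match else None


def unique_product_ids(urls: Iterable[str]) -> List[str]:
    """Collapse URLs that differ only in tracking parts, keeping first-seen order."""
    seen = set()
    ordered = []
    for url in urls:
        pid = product_id(url)
        if pid is not None and pid not in seen:
            seen.add(pid)
            ordered.append(pid)
    return ordered


if __name__ == "__main__":
    # Hypothetical URLs: the first two point at the same product.
    urls = [
        "https://www.amazon.com/Some-Name/dp/B000000001/ref=sr_1_1?tag=abc",
        "https://www.amazon.com/gp/product/B000000001?psc=1",
        "https://www.amazon.com/dp/B000000002",
    ]
    print(unique_product_ids(urls))
```

Running the deduplication on the ID list first means each product page is fetched once, which is where the proxy bandwidth savings would come from.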
|
|
|
|
Now that my list of product IDs is in the millions and I've used about 40 GB of proxy bandwidth scraping maybe 50k pages from that data, I have to carefully weigh how much I want to spend on proxies (I've spent about $30 so far) for an experiment that could end in a simple takedown notice that stops the method. Granted, I can always reuse and modify this data. But I guarantee that if you had a million-page site built directly around real ecommerce products, you would make good money if it stays up.
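To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes the $30 covers the whole 40 GB, that cost and bandwidth scale linearly with page count, and that 1 GB is 1000 MB; all three are simplifications, and the inputs are the approximate figures from the post.

```python
from typing import Dict


def extrapolate(spent_usd: float, used_gb: float,
                pages_done: int, pages_target: int) -> Dict[str, float]:
    """Linearly scale observed proxy cost and bandwidth to a larger page count."""
    return {
        # Average transfer per page, images included (1 GB taken as 1000 MB).
        "mb_per_page": used_gb * 1000 / pages_done,
        "usd_per_page": spent_usd / pages_done,
        # Multiply before dividing to keep the arithmetic exact for round inputs.
        "total_gb": used_gb * pages_target / pages_done,
        "total_usd": spent_usd * pages_target / pages_done,
    }


if __name__ == "__main__":
    # ~$30 and ~40 GB for ~50k pages, scaled to a 1-million-page site.
    estimate = extrapolate(spent_usd=30, used_gb=40,
                           pages_done=50_000, pages_target=1_000_000)
    for name, value in estimate.items():
        print(f"{name}: {value:g}")
```

With those inputs the estimate works out to about 0.8 MB per page, and roughly 800 GB and $600 for a million pages, before any savings from skipping duplicate IDs or pulling images through cheaper proxies.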
|
|
|
|
Thinklynx is finally back! My thoughts have been so disconnected but now I can finally piece them together again. Why aren't there more novelty vehicles on the road? Are we really supposed to believe the difference between an IQ of 100 and 70 is the same as between an IQ of 130 and 100? Should I cancel my plans with a friend who is pretty late by this point or just wait it out? Should legal text even be called English?
I've missed you, Thinklynx.
|
|
|
|
Think about a fake blood or spit sample with malware, though. They've been encoding massive amounts of data into DNA for a while, but if it carries over to anything more widely used than DNA, it could be a huge issue for any court or hospital. Crazy shit, though. So specific, but just think: 20 years from now it could be a real attack vector.
|
|
|
|
Amazon Is Finally Profitable, Earns $2.5 Billion Over the Last Three Months
Amazon topped $2 billion in quarterly profit for the first time in its history, an impressive run fueled by continued growth in Prime subscriptions, cloud computing and its nascent advertising business. Amazon said Thursday that it earned $2.5 billion in profit for the three months ending in June, a staggering jump from the $197 million it posted in the same period last year. It marked the third consecutive quarter that Amazon has topped $1 billion in profit, a remarkable feat for a company once known for investing so much in its business that it often lost money. "The profitability trajectory appears to be accelerating quicker than expected," Daniel Ives, an analyst with GBH Insights, wrote in an investor note ...
|
|
|
|
Amazon Opens Up Its Internal Machine Learning Training To Everyone
Amazon announced Monday that it's making the machine learning courses it uses to train its engineers available to everybody for free. The course is tailored to four major groups -- developers, data scientists, data platform engineers and business professionals -- and it offers both foundational level lessons as well as more advanced instruction.
https://aws.amazon.com/blogs/machine-learning/amazons-own-machine-learning-unive...
|
|
|
|
This is a review of the new JMT album on Amazon:
I was waiting for the train having a cool blunt session while listening to the new Jedi Mind Tricks album, and just my luck, some torn skinny jean, tight neon polkadot T-shirt, sugar shoes wearing, hot Cheetos hair rocking weirdo is killing my vibe by singing Drake songs out loud. At first I asked him nicely to keep it down cuz I just got off probation and I didn’t want to catch a case, but the clown kept on mumble rapping and singing. Then out of nowhere, in a high pitch estrogen like feminine voice, he starts talking about how Drake is the best rapper alive and I’m hating on him cuz I’m a hater, Hahahaha!!! I tried to pay him no mind, so I turned up the volume on my headphones, and kept head nodding to the music, but the guy obviously wan...
|
|
|
|
Referring a user to Amazon through your affiliate link gets you a tracking cookie that lasts anywhere from 24 hours to 90 days, and you earn commission on anything the user purchases in that period from Amazon, the biggest online store in the world. One million product pages will bring in long-tail search traffic, and careful analytics will reveal the most promising niches and products, around which new hyper-focused niche sites can be created.
|
|