Manny's Reviews > Machine Learning Techniques for Text: Apply modern techniques with Python for text processing, dimensionality reduction, classification, and evaluation

Machine Learning Techniques for Text by Nikos Tsourakis

Manny's review

really liked it
bookshelves: chat-gpt, science, what-i-do-for-a-living, received-free-copy

I shared an office with Nikos for years, and we wrote a bunch of joint papers, where often I would contribute something based on classical computational linguistics methods and Nikos would contribute something based on machine learning; I think the one I'm most pleased with is this effort from 2017. Nikos knew the machine learning toolkits very well, and he was always able to get things done with great ease, so I never got around to acquiring these useful skills. Now I've just left the University of Geneva and moved to the University of South Australia, and I no longer have direct access to his expertise; but, with excellent timing, Nikos has published this book. It's almost like having him back on the other side of the office again.

Some books on machine learning are full of matrix algebra and partial derivatives, but there's little of that in Nikos's book: he's a hands-on kinda guy, and the book is constructed in a hands-on kinda way. He organises the text around ten case studies, each centred on a data analysis task that involves machine learning, and he walks you through a solution using scikit-learn, PyTorch, Keras, pandas, Matplotlib and the other Python libraries. The examples are engaging: detecting spam emails, classifying newsgroup posts, recommending music titles, and the like. By the end, you're using advanced deep learning techniques to build things like nontrivial chatbots.
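To give a flavour of the style (this sketch is mine, not taken from the book): a spam detector of the kind the first case studies cover can be a few lines of scikit-learn, with a toy dataset standing in for a real email corpus.

```python
# A minimal sketch (not from the book) of a spam-detection pipeline
# in scikit-learn: vectorise the text, then fit a classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative dataset; a real case study would load a corpus.
emails = [
    "win a free prize now, click here",
    "cheap meds, limited offer, buy now",
    "meeting moved to 3pm, see agenda attached",
    "draft of the joint paper, comments welcome",
]
labels = ["spam", "spam", "ham", "ham"]

# TF-IDF features plus logistic regression, chained in one pipeline.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(emails, labels)

print(model.predict(["free offer, click now"])[0])
```

The real examples are of course richer (proper corpora, evaluation, tuning), but the shape of the code is genuinely this compact.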

It's amazing how powerful and versatile these packages have become. You just type 'pip install' plus the name of the package, and two minutes later you have something sitting on your machine which was cutting-edge software only ten years ago, in some cases much less: you can organise data, train a model, apply it, visualise the results. The packages are neatly set up with easy-to-use interfaces that let you do everything with a few simple commands. Nikos walks you through it, and you see how straightforward it is once you've mastered the tricks. These days (the book came out just before ChatGPT), it's become even easier: once you know something is possible, you can ask Chat for the invocations, and it'll generally be able to give you a solution that either works, or is close and can be fixed after a bit of discussion. The possibilities it opens up are staggering.
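The "organise data, visualise the results" part really is a handful of lines once `pip install pandas matplotlib` has run. Here's my own toy illustration (the data is made up, purely for the sketch):

```python
# A sketch of the "few simple commands" workflow: organise some data
# in pandas, then plot it with matplotlib. Toy data, not from the book.
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Organise data.
df = pd.DataFrame({"year": [2013, 2018, 2023],
                   "ml_courses": [20, 120, 400]})

# Visualise the results and save the figure to a file.
df.plot(x="year", y="ml_courses", kind="bar", legend=False)
plt.ylabel("courses offered")
plt.savefig("ml_courses.png")
print(len(df))
```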

Many times, I was reminded of Flynn's thought-provoking book What Is Intelligence?, which I read back in 2010. Flynn discovered the effect that bears his name, according to which IQ scores steadily rise by a few points a decade. People have questioned the validity of the Flynn Effect: are we really getting smarter at that rate? How can such a thing be possible? Flynn argues persuasively that it's actually not mysterious at all. A large part of intelligence, he says, is about collecting together a better mental toolkit. As time progresses, more and more useful tools are developed, and it becomes easier and easier to acquire them. Machine learning is a striking example. Not long ago, being able to use these techniques would have made you an exceptionally smart person. Now, it's just a bunch of tricks that anyone can pick up with a little diligence. But the tricks are no less powerful just because they're easily accessible. Get Nikos's book or a similar one, read through it, experiment a bit under ChatGPT-4's friendly guidance, and you'll feel measurably smarter. Maybe you'll actually be measurably smarter. Why don't you find out?


Reading Progress

June 21, 2023 – Started Reading
June 21, 2023 – Shelved
June 21, 2023 –
page 30
6.7% "Thank you Nikos! Just started..."
June 27, 2023 –
page 100
22.32% "A handy rule of thumb: if your feature set is to be useful, you need at least five training examples for every feature."
July 7, 2023 –
page 200
44.64% "Systems of this kind try to identify similarities between customers based on past behaviours and people with similar purchase habits can recommend products to each other. The benefit, in this case, is that customers are exposed to items in which they have never expressed any explicit interest."
July 12, 2023 –
page 280
62.5% "The more content that is available online, the less easy it is to discover and consume the most important information efficiently. Automatically extracting the gist of longer texts into an accurate summary and thus eliminating irrelevant content is urgently needed. Once again, machines can undertake this role."
July 15, 2023 –
page 305
68.08% "In the next section, we stand on the knowledge accumulated so far to dive deeper."
July 17, 2023 –
page 385
85.94% "Try to think of the last time you contacted the call center of a company, where an automated system probably answered your call. Replacing the human factor presents many competitive advantages in terms of cost and availability. However, these systems do not fully incorporate the communicative behaviours humans use and hence are limited in reaching their full potential."
July 19, 2023 – Finished Reading
July 23, 2023 – Shelved as: chat-gpt
July 23, 2023 – Shelved as: science
July 23, 2023 – Shelved as: what-i-do-for-a-living
July 23, 2023 – Shelved as: received-free-copy

Comments Showing 1-2 of 2 (2 new)


message 1: by Théo d'Or (new)

Théo d'Or Interesting challenge. All my life I dreamed of being measurably smarter. Today more than yesterday, but less than tomorrow.


Manny Dream no more!

