ScroogLyrics

Amarok 1.x Scripts

Description:



********************
OBSOLETE
********************
Get GoogLyrics: http://kde-apps.org/content/show.php?content=73850






NOTICE
* DO NOT USE THIS SCRIPT RIGHT NOW *
An update will be available soon that will fix the current Scroogle issue. Sorry for any inconvenience. A renaming might be in order, so watch out for a Googlyrics.

I've been annoyed by the low yield on some of my more obscure music with many of the Amarok lyric search scripts available. Way back in my Windows days I used a program called EvilLyrics, which had an interesting idea: Google for the lyrics, then rip them off known websites. Well, I decided this is what Amarok needed! After a quick day of scripting, here's the result: a true lyrics metasearch script. Since Google's SOAP API is crap (it no longer lets you register keys and returns dramatically different results from web search), I decided I'd scrape. Since Scroogle has cleaner markup, I decided it'd be even better to scrape than Google itself (and the paranoid might like me more or something). Anyway, the script searches Scroogle (hence the name Scrooglyrics) for the song lyrics, then pulls them off the known sites. Adding sites should be easy, so tell me what sites you'd like to see and I'll be sure to get them in the next release! Remember, these should be reputable sites that show up in the first 20 results on Google for your song's lyrics.
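
The approach described above (search, then pull lyrics from the first result that belongs to a known site) can be sketched roughly as follows. This is a minimal Python illustration, not the script's actual Perl code. The names `KNOWN_SITES`, `site_of` and `first_known_result` are hypothetical, and the site list is just the three sites named for release 0.1.

```python
from urllib.parse import urlparse

# Hypothetical list of supported sites (the three named for release 0.1).
KNOWN_SITES = {"azlyrics.com", "lyrics007.com", "actionext.com"}


def site_of(url):
    """Return the host of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def first_known_result(result_urls, known_sites=KNOWN_SITES, limit=20):
    """Return the first of the top `limit` results hosted on a known site."""
    for url in result_urls[:limit]:
        if site_of(url) in known_sites:
            return url
    return None
```

The `limit=20` default mirrors the "first 20 results" rule of thumb; a real script would then fetch the returned page and apply a site-specific pattern to extract the lyrics.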

To use this script you must have the WWW::Mechanize Perl module installed. Your distribution probably has a package for it, but if it doesn't, you can do this the Perl way: su to root, then use cpan to install it, e.g.:
$ su
# cpan
cpan> install WWW::Mechanize

---- OR ----

If you're on Ubuntu or Debian, you can simply apt it with this command:
sudo apt-get install libwww-mechanize-perl

I'd like as much input on this as possible, so, if you've tested it, please comment! Your opinion is wanted :)

If you're able to, please report bugs directly to our bugtracker at:
http://udmp.info/mantis

Last changelog:

11 years ago

0.11:
- Major rewrite
- Now easier than ever to add lyrics sites or disable them in the source (options screen coming soon)
- enhanced whitespace cleaning.
- overall cleaner code
- Now spoofs user agents to IE6 on Windows to work around any weak blocking attempts.

0.10:
- Multiple search queries to find song, starts most accurate, works down to least (this should remove errors where an incorrect page is ripped)
- Added support for themadmusicarchive.com
- Updated sing365 regex
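
The "most accurate first" idea from 0.10 could look something like the sketch below. It is written in Python for illustration only. The exact queries the script sends are not documented here, so the three query shapes and the `search` callable are assumptions.

```python
def build_queries(artist, title):
    """Return search queries ordered from most to least specific."""
    return [
        f'"{artist}" "{title}" lyrics',  # exact artist and exact title
        f'{artist} "{title}" lyrics',    # exact title only
        f'{artist} {title} lyrics',      # loose match on both
    ]


def search_with_fallback(artist, title, search):
    """Try each query in turn; return the first non-empty result list.

    `search` is any callable taking a query string and returning a list
    of result URLs (a stand-in for the real search request).
    """
    for query in build_queries(artist, title):
        results = search(query)
        if results:
            return query, results
    return None, []
```

Starting strict and loosening only when nothing comes back is what should keep an unrelated page from being ripped when a precise match exists.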

0.9:
- Supports lyriki.com
- Supports lyricspy
- Now supports local lyrics!
To use local lyrics, make a folder called "lyrics" in your home directory and store files named either as Title.txt or as Artist - Title.txt
Note that these are case sensitive, so be careful before you say it doesn't work.
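
A rough Python sketch of the local lookup described for 0.9, assuming the "Artist - Title.txt" name is tried before "Title.txt" (the real order is not stated) and using a hypothetical helper name `find_local_lyrics`:

```python
from pathlib import Path


def find_local_lyrics(artist, title, lyrics_dir=None):
    """Return the text of a matching local lyrics file, or None.

    Looks for "Artist - Title.txt" first, then "Title.txt".
    """
    base = Path(lyrics_dir) if lyrics_dir else Path.home() / "lyrics"
    for name in (f"{artist} - {title}.txt", f"{title}.txt"):
        candidate = base / name
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8", errors="replace")
    return None
```

The file names are built straight from the tags without any case folding, which is why a tag of "these words" will not match a file called "These Words.txt" on a case-sensitive filesystem.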

0.8:
- Several major changes UPGRADE STRONGLY RECOMMENDED!
- Script will now fail properly if no lyrics are found
- Now fails properly when there's no connection
- Fixed several regexes which were not working properly
- Whitespace problems are now solved for several sites

0.7.4:
- No more crashing on connection problems
- Now removes parts of title in parentheses.
- Added in README and COPYING for about dialog

0.7.3:
- Fixed bug with artist or song names containing special characters.

0.7.2:
- Fixed bug where artist name is not sent in search request

0.7.1:
- Added some extra debugging code
- Fixed lyrics007 linebreak bug
- Now removes starting "The" in artist names, seems to get better results
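
The two clean-up steps mentioned in 0.7.4 and 0.7.1 (dropping parenthesized parts of the title and a leading "The" in the artist name) can be illustrated like this. It is a Python sketch under the assumption that simple regular expressions are enough; `clean_title` and `clean_artist` are hypothetical names, not functions from the script.

```python
import re


def clean_title(title):
    """Drop parenthesized parts such as "(Live)" and tidy the spacing."""
    without_parens = re.sub(r"\([^)]*\)", "", title)
    return " ".join(without_parens.split())


def clean_artist(artist):
    """Drop a leading "The " (case-insensitive) from an artist name."""
    return re.sub(r"^the\s+", "", artist.strip(), flags=re.IGNORECASE)
```

Requiring whitespace after "the" keeps names that merely start with those letters intact.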

0.7:
- Removed dependencies on HTML::Entities and HTML::Strip, just needs Mechanize now.

0.6:
- BIG Update!
- Now searches songmeanings.net and wearethelyrics.com
- now only googles for results in sites needed
- now will keep looking if a regex fails instead of dying
- big thanks to mattepiu and neoeno for their work towards this version, great job!
- Stay tuned for a version that will include modules.

0.5.1:
- Fixed capturing for actionext.com and azlyrics.com
- Added a little extra debugging, if anyone's having problems, pull up the output window for the script and see if a regex has failed.

0.5:
- Fixed packaging error

0.4:
- Added letssingit.com and lyricwiki
- code cleanup, now passes use strict

0.3:
- Bug fix :) No more last-lyrics bug. I hope.

0.2:
- Added sing365.com support. I'll be working on that last-lyrics bug for next release.

0.1:
Initial release. Currently supports 3 lyrics websites: azlyrics.com, lyrics007.com and actionext.com. This will probably be expanded later, but I easily find most of my more obscure songs with just these 3! This, my friends, is why we need a lyrics metasearch.

ultramancool

11 years ago

Well everyone, it seems I'm completely firewalled out of Scroogle, nice work, eh? Don't worry, I'll release an update by tomorrow to fix this up. Sorry for the inconvenience.

PhobosK

11 years ago

Hi, first let me thank you for the nice lyrics script.
Now to the problem: it worked for a while, but then it started to give an error in the WWW::Mechanize module (at the beginning it happened randomly, but now it prevents the script from working as soon as any song is played). That is why I upgraded to version 0.11, but the problem still exists. More info:
- WWW::Mechanize tested with 1.20 (the original version in Gutsy) and 1.34 (latest version from CPAN)
- Amarok 1.4.7
- Unicode system
- Nothing in the script output window
The error is:
(WWW::Mechanize 1.20) Can't call method "value" on an undefined value at /usr/share/perl5/WWW/Mechanize.pm line 1107, <STDIN> line 1.

(WWW::Mechanize 1.34) Can't call method "value" on an undefined value at /usr/local/share/perl/5.8.8/WWW/Mechanize.pm line 1247, <STDIN> line 3.

PS. BTW, the bug report system requiring sign-up is not very convenient for users, I think, but anyway it is your decision, so I do not judge you.

ultramancool

11 years ago

I'm using WWW::Mech 1.34 myself (and Amarok 1.4.8), and I only saw this problem back when it was possible for Mechanize to fail to retrieve a page and for the script to continue trying to parse it. Now we have error detection for that kind of thing :). Seeing as you still had this issue in 0.11, I'd like to look into this more. Can you name a few songs you were trying to query that got this result? Do you need a proxy or anything to connect to the internet (though I expect that would have given you a connection failure message)?

You're right about the bug report system, sorry about that.

PhobosK

11 years ago

No, I do not use any kind of proxy or anything.
It happens with practically every song.
Here are some of them (Title/Artist/Album):

01 - Dark Skies/Blutengel/Angel Dust - Ltd Edition
Adelante - Bonus/SASH/S4! Sash!
These Words/Natasha Bedingfield/Unwritten
05_ Why/DJ Sammy/The Rise

PhobosK

11 years ago

Ok... It's a shame really... The problem is not in the code at all but in the "Scroogle sysops" (I will not use any qualifications about them because I cannot find a proper word; my dictionary is not so rude). What they do is block IPs that have used their service via the script (this is only my guess). What I get from my IP as a result of the query (LWP debug):
LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
HTTP::Cookies::add_cookie_header: Checking www.scroogle.org for cookies
HTTP::Cookies::add_cookie_header: Checking .scroogle.org for cookies
HTTP::Cookies::add_cookie_header: Checking scroogle.org for cookies
HTTP::Cookies::add_cookie_header: Checking .org for cookies
LWP::UserAgent::send_request: GET http://www.scroogle.org/cgi-bin/scraper.htm
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 18 bytes
LWP::UserAgent::request: Simple response: OK
Can't call method "value" on an undefined value at /usr/local/share/perl/5.8.8/WWW/Mechanize.pm line 1247, <STDIN> line 3.

The response is:

HTTP/1.1 200 OK
Date: Mon, 14 Jan 2008 22:15:56 GMT
Server: Apache/2.0.51 (Fedora)
Content-Length: 18
Connection: close
Content-Type: text/plain; charset=UTF-8

Server too busy.


So the script fails.
I ran simultaneous (sync) queries from 3 different IPs to http://www.scroogle.org/cgi-bin/scraper.htm
and the one from my IP fails every time....
+ The query CGI script: http://www.scroogle.org/cgi-bin/nbbw.cgi?Gw=point+of+no+return
gives a forbidden error....

I am speechless ...
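
The trace above shows a 200 OK response carrying a short text/plain body instead of the search form, after which the Mechanize call fails, presumably because there is no form to fill in. One way to guard against that, sketched in Python with a hypothetical helper `is_usable_search_page` (this is not the script's code), is to check the response before touching any form:

```python
def is_usable_search_page(status, content_type, body):
    """Return True only if the response looks like an HTML page with a form.

    A 200 status alone is not enough: the blocked request above returned
    200 OK with an 18-byte text/plain body.
    """
    if status != 200:
        return False
    if "html" not in content_type.lower():
        return False
    return "<form" in body.lower()
```

With a check like this the script could report "search service unavailable" rather than dying inside the Mechanize module.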

ultramancool

11 years ago

Oh really? The user agent switch in the latest version still seems to be working for me. If anyone else reports this issue I'll switch to ripping Google, that's less evil anyways.

DanielBrandt

11 years ago

Scroogle is blocking this script as of 2008-01-13. The only reason Google, Inc. barely tolerates Scroogle, which is an unauthorized scraper run by a nonprofit so that individual users can see Google results without compromising their privacy, is because I have made assurances to Google that I am actively blocking automated inquiries. Due to increased publicity for this script, I felt that it was time to block users who arrive at Scroogle through the use of this script.
-- Daniel Brandt, Scroogle sysop

ultramancool

11 years ago

I'm sorry to hear this. Sorry to tell you, but no one asked for your opinion. Well, it doesn't appear any block is in effect yet. Please read the bottom question here: http://udmp.info/content/view/13/29/. In any case, the next release will spoof the user agent to Windows IE6, so I hope you like playing cat and mouse. I feel the queries themselves are fairly hard to identify as automated in the new release too. Enjoy Scrooglyrics 0.11!

qurk

11 years ago

I have to say your script is very, very good. Lyrc seems to have been down the last couple of days, and it was frustrating copy-pasting all the lyrics in. I tried your script about a month ago and it didn't work too well, but I went to hotnewstuff and installed the new version a little while ago, and wow. Every song is practically instantaneous; even a couple of 10-second songs flashed their lyrics before changing. I am impressed, you are doing a good job.

morethanskindeep

11 years ago

First of all, thank you for this amazing script: it works very well and is better than anything I have ever seen! I am happy to have found it after continuing frustrations with others.


1) I have noticed one thing: not all lyrics on lyricwiki.org seem to be found by scroogle.
For example, I tried the query

"Otis Taylor" "Buy Myself Some Freedom" lyrics

giving only useless results, though the site http://lyricwiki.org/Otis_Taylor:Buy_Myself_Some_Freedom does exist. (Can that be because it was added only recently?)


2) Second of all, I was wondering what you think about adding support for local lyrics files -- first searching the file names in some local directory (see also http://kde-apps.org/content/show.php/Local+Lyrics?content=37981) and in case nothing is found asking scroogle.
I don't know enough to do that myself, but I can imagine that it is not overly difficult?


Thanks again... :)

ultramancool

11 years ago

Thanks, I was in an awful mood and needed to know my work was appreciated.

1) Yeah, that's just because the page was recently created and is yet to hit google, not much I can do about that, sorry.

2) This seems like a pretty good idea, I'll see what I can do, maybe for 0.9 :)

v6lur

11 years ago

Please add Lyriki :)
(www.lyriki.com)

ultramancool

11 years ago

Support for Lyriki will be available in the next release.

stifi

11 years ago

Thanks for the great script!

One thing is really annoying. If some lyrics could not be fetched, the error message "Failed to find any lyrics. Press refresh to try again." is cached as lyrics in Amarok. If you instead send the notification "<?xml version=\"1.0\" encoding=\"UTF-8\" ?> <suggestions page_url=\"your_url\"></suggestions>", Amarok will display an error in the Context Browser and will not cache any lyrics [1].

I already changed the source for my needs. But changing it in the public version may help others.

[1] http://amarok.kde.org/wiki/Script-Writing_HowTo
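
For illustration, the empty-suggestions document quoted above could be generated like this. It is a Python sketch with a hypothetical function name `not_found_response`; the only addition to the quoted XML is escaping of the URL, so that characters such as "&" do not break the attribute.

```python
from xml.sax.saxutils import quoteattr


def not_found_response(page_url):
    """Build the empty <suggestions> document quoted in the comment above."""
    return (
        '<?xml version="1.0" encoding="UTF-8" ?>'
        f"<suggestions page_url={quoteattr(page_url)}></suggestions>"
    )
```

Returning this instead of a plain-text error message is what, per the comment, keeps the failure from being cached as lyrics.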

ultramancool

11 years ago

Thanks for the info! I was wondering how to do it earlier. This'll surely be in the next revision.

mattepiu

11 years ago

Python version with just 2 dependencies:
python-mechanize and stripogram

http://pastebin.ca/830909

ultramancool

11 years ago

Wow, cool, nice work! I plan on making the next version include some of your improvements too, stay tuned, probably a big update coming within the next few days.

mattepiu

11 years ago

There was some hackish code in the Python script I posted above,
so I recoded it better and this time I removed the stripogram dependency
(just one dependency: python mechanize).

P.S.: I had to decode from utf8 the
song365 lyrics as they were encoded
so, I don't know if perl version already does it....

http://pastebin.com/f19836eee

ultramancool

11 years ago

Well, I've removed the dependencies on HTML::Strip and HTML::Entities in the latest release along with several other improvements you and neoeno have given me ideas/contributed code for, thanks! Hopefully the new version will be more friendly towards inexperienced users, but some ubuntu packages would probably be nice.

mattepiu

11 years ago

Oops, this went unnoticed on the previous comment page, so I'll repost...

1) DEPENDENCIES
Is Entities really needed?
I have issues with it, so I translated this script in python, and I'm using
.encode('ascii', 'xmlcharrefreplace') to get that functionality.

(well, I had to work a bit more cause python implementation of mechanize
is different)
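
As a small illustration of the `xmlcharrefreplace` trick mentioned above, here it is in present-day Python 3 syntax (the original port predates it, so this is an adaptation, and the wrapper name `to_ascii_with_entities` is made up):

```python
def to_ascii_with_entities(text):
    """Encode text as ASCII, replacing other characters with &#NNN; references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

For example, `to_ascii_with_entities("café")` gives `caf&#233;`, which covers the encoding direction that an entities module would otherwise handle.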

Is there a way to reduce dependencies?
That would make it more likely that this could replace the basic script
(which is the only Ruby script I use, and I'd like to wipe Ruby completely)

P.S.: In both google and scroogle
you can restrict your search to domains:
before your query add
"site:www.azlyrics.com OR site:www.lyrics.org OR ..."

and you'll get results only from those domains...
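
The domain restriction described above amounts to a simple string prefix. A minimal Python sketch, with a hypothetical helper name `restrict_to_sites`:

```python
def restrict_to_sites(query, sites):
    """Prefix a query with a site: filter joined by OR."""
    if not sites:
        return query
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f"{site_filter} {query}"
```

Calling `restrict_to_sites('"A" "T" lyrics', ["www.azlyrics.com", "www.lyrics.org"])` yields `site:www.azlyrics.com OR site:www.lyrics.org "A" "T" lyrics`. The prefix grows with every supported site, so a long site list produces a long query.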

neoeno

11 years ago

I made a few additions and edits to your script so it would work better for me. I figured I'd give you a chance to leech from that if you want.

My changes as I remember them:
- Adding songmeanings support
- Adding wearethelyrics support (I basically just added whatever site it took to make the script pick up my lyrics)
- I did add sing365, but you added that in the latest release anyway, so I let yours take precedence.
- Altering the search string slightly, I added 'intitle:' before the title of the song, might want to do that with the artist too. The idea was that pretty much all lyrics sites have the songTitle in the title, so it helps to weed out useless results.

A suggestion:
For a while I added the sites available to the search string (e.g. 'site:songmeanings.net OR site:sing365.com ...'), but I took it out because it was making the search query rather long.. and sometimes it would cause non-lyrics pages of the site to come up first. A bit more investigation into this might optimise the script considerably. Perhaps multiple scroogle queries would be helpful?

Anyway, here's my current script:

http://pastebin.ca/829747 (perl syntax highlighting seems to be broke'd on that site)

ultramancool

11 years ago

Glad to see others taking some interest in this little project of mine! You've got some really interesting modifications here, and since I can see your code is already based off 0.5.1, would you mind if I release it as 0.6? As far as the site: OR site: thing goes, we might be better off doing something like inurl:lyricssite.com/lyrics so that we just get lyrics pages and not other garbage pages off their site.

neoeno

11 years ago

No worries, do what you like with it :) I'll keep you posted if I add any other sites to it.

Ah, yes, using inurl would probably be better.

Have you considered talking to the Amarok devs about this? Seems like with a bit of expansion it would be a shoo-in for the default script, since it's a metasearch and it's not hopelessly slow like the other one...

Details
version 0.11
updated Jan 18 2008
added Dec 03 2007