segfaults and libjpeg.so.62 missing after update

I updated my Gentoo system last week (amd64) and ran into multiple problems.

Due to compile errors on many ebuilds I decided to run a revdep-rebuild. After about an hour it recognized that about 380 ebuilds were broken and had to be remerged. I began but could not finish that day so I left revdep-rebuild (which happened to stop due to an error anyway) and shut my system down.

The next day I realized that about half my system was broken. CUPS crashed repeatedly, reporting a segfault in libc whenever I tried to print anything (visible in the kernel log). Remerging cups or glibc did not help, so I went to the Gentoo forums and found the advice to run

emerge -e system 
etc-update 
perl-cleaner all 
python-updater 
emerge -1 libtool 
revdep-rebuild

It took a long time, with the usual stop-and-go caused by build errors, but it finally finished. CUPS worked again and so did most of the system. Unfortunately, 64 bit binaries like nxclient still did not run because libjpeg.so.62 no longer existed. Well, it did for 32 bit apps (so the 32 bit Firefox binary could still run; that version belongs to app-emulation/emul-linux-x86-baselibs) but not for 64 bit. As usual, linking to the 32 bit library failed (wrong ELF architecture), and linking to libjpeg.so.7 didn’t work well either (I was able to run NX but all JPEGs were missing).
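The “wrong ELF architecture” error comes down to a single byte in the file header. Here is a minimal sketch, assuming you just want to check whether a given library is a 32 bit or a 64 bit build before symlinking it. The function names and the example path in the comment are made up for illustration; only the ELF header layout (magic bytes followed by the class byte at offset 4) is standard.

```python
import sys

ELF_MAGIC = b"\x7fELF"
# Values of e_ident[EI_CLASS] as defined by the ELF specification.
ELF_CLASSES = {1: 32, 2: 64}


def elf_class_from_header(header):
    """Return 32 or 64 for the first bytes of an ELF file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    try:
        return ELF_CLASSES[header[4]]
    except KeyError:
        raise ValueError("unknown ELF class byte: %d" % header[4])


def elf_class(path):
    """Read the header of the file at path and return 32 or 64."""
    with open(path, "rb") as handle:
        return elf_class_from_header(handle.read(5))


if __name__ == "__main__":
    # Hypothetical usage: python elfclass.py /usr/lib64/libjpeg.so.62
    for name in sys.argv[1:]:
        print("%s: %d bit" % (name, elf_class(name)))
```

A 64 bit binary such as nxclient can only load libraries for which this reports 64, which is why the 32 bit copy from emul-linux-x86-baselibs was of no use.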

I was lucky enough to find a rather unrelated bug report on the bugtracker indicating that there is a second ebuild for libjpeg called media-libs/jpeg-compat, which actually contains the missing 64 bit version of libjpeg.so.62. After emerging it, everything works fine again.

Opera 10: finally I got Flash and Java back

I don’t know what they did, but Opera 10 Beta 3 (build 4537) finally got Flash working again on my Linux system. Back in 2007 Flash 9 started to use GTK exclusively and Opera 10 could no longer run the plugin: Opera would simply freeze and/or crash. Due to security updates I finally had to upgrade the Flash player and had to start Firefox for every Flash website (or Flash video) I wanted to view. With every update I checked whether support had returned, but it still would not run. In 2008 I changed to 64 bit, and while many people reported it had started working again on their 32 bit systems, I still only got crashes. In February I started using KDE 4 (and Qt 4), and while people started reporting that Flash worked on 64 bit, I – again – could not use it.

Maybe I was always upgrading/updating at the wrong time – there is one thing I can read from the filenames portage downloaded:

opera-10.00-b1.gcc4-shared-qt3.x86_64.tar.bz2
opera-10.00-b3.gcc4-qt4.x86_64.tar.bz2

It seems like older versions have only been built (or downloaded) with Qt3 libraries but with beta 3 there finally is a Qt4 version available. Maybe that’s the fix – finally, after 2 years, I can use Flash again in my primary browser! 😀

I checked some websites and while all Flash movies run and most run fine, videos are still jerky, depending heavily on the player being used and the mode (fullscreen is worst). While these performance problems are worse than in Firefox, it’s still mainly a problem of the Linux version of the Flash player (xkcd made a cartoon about it recently).

The next thing I could not use since I switched to 64 bit was the Java plugin. It was the same as with Flash: I got freezes and/or crashes and no useful error messages. I was quite surprised when I finally saw a useful backtrace on my terminal: for some reason, Opera 10 tried to use Blackdown JDK 1.4, which some ebuild must have had as a dependency. I unmerged it and then got errors complaining about libjvm.so not being found. To point Opera to the correct path, go to Tools/Preferences/Advanced/Content, enable Java and open the Java Options dialog. The path to enter can be found by running locate libjava.so and should be something like /opt/sun-jdk-1.6.0.15/jre/lib/amd64/ (excluding the filename; you will need to update this on version changes).

On Gentoo you may still get an error message telling you that libjvm.so cannot be found. You may need to symlink it from …/jre/lib/amd64/server/ to …/jre/lib/amd64/ and it should be working after a browser restart.
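The symlink step can be sketched like this. It is only an illustration, assuming the layout described above (libjvm.so sits in a server/ subdirectory of the JRE library directory); the function name link_libjvm is made up, and you will need sufficient permissions to write into the JDK directory.

```python
import os


def link_libjvm(jre_lib_dir):
    """Symlink server/libjvm.so into jre_lib_dir unless something is already there.

    Returns the path of the link (or of the already existing entry).
    """
    source = os.path.join(jre_lib_dir, "server", "libjvm.so")
    link = os.path.join(jre_lib_dir, "libjvm.so")
    if not os.path.isfile(source):
        raise FileNotFoundError(source)
    if not os.path.lexists(link):
        # Relative target, so the link stays valid if the JDK directory
        # is moved as a whole.
        os.symlink(os.path.join("server", "libjvm.so"), link)
    return link
```

With the path from above this would be called as link_libjvm("/opt/sun-jdk-1.6.0.15/jre/lib/amd64"); like the path in Opera’s dialog, the link has to be recreated after a JDK version change.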

I don’t know how long the Java Options dialog has existed, so maybe I could have used Java for quite some time already. Anyway, it’s great to finally have both plugins working again.

Amarok 2 with Gentoo on AMD64

Finally, that’s possible. I will explain what’s necessary in case it’s not yet in portage (just saw the patches are making it to overlays now 😀 ) but not without a warning and explanation: Amarok 2 has been blocked by MySQL not compiling correctly for use as a shared library. The main bug report on Gentoo’s bugtracker is here for MySQL. Although it now seems to have made its way into mysql-extras overlay (?) it may still take a while to be verified for not causing any problems at all.

DO NOT TRY THE FOLLOWING STEPS ON A PRODUCTION SYSTEM!

Read all instructions carefully. I’m not responsible for any data loss or corruption you may experience by following this howto and using the patched MySQL or beta release of Amarok.

You may want to back up your databases before you start patching. You have been warned (although everything seems to be fine on my system).

Patching MySQL
When I got it working yesterday, I still patched the eclass file to add a “pic” USE flag that triggers a GCC option. According to the latest bug comments, this is the wrong solution and should no longer be necessary. However, I still get the linker error “recompile with -fPIC” when compiling Amarok 2 afterwards. So you still need to patch the eclass using the files that are struck out on the bug report (mysql.eclass and a patch to it; apply the patch or grab the resulting file here). You may want to try whether it works for you without the patched eclass, though. If you choose to use the patched eclass, put it into the eclass directory of your overlay (top-level, like a category). On the next emerge you will be warned to add “metadata-transfer” to your FEATURES variable in /etc/make.conf and to run emerge --regen after each sync (this will take a while; ~20 minutes on my 3GHz Core2Duo).

Now we need an ebuild, too. Download the patch file to MySQL’s source code (better check if it’s still current) and store it in your local overlay’s dev-db/mysql/files. Copy the latest ebuild for MySQL from official portage to your overlay and rename it to match version numbers with the patch. Run ebuild mysql-whateverversion.ebuild digest, unmask and emerge it! (using “embedded” USE-flag)

Amarok 2.1 Beta 1
If we go unstable, we do it right. So we are going to create an ebuild for Amarok 2.1 Beta 1, too. Again, BE WARNED: There are some possible corruption issues with ID3 tags (also reported for 2.0.1). You don’t want that to happen, so be careful not to do any writing operations to your files in Amarok until that bug gets fixed. (I got a couple more with reading tags but my files are still intact, so I’m fine; all bugs are already in the bug tracker)

Amarok 2.1 depends on taglib-extras, which is currently not in portage. So simply go to http://gpo.zugaina.org/media-libs/taglib-extras and grab 0.1.2’s ebuild, put it in overlay, digest, unmask and emerge.

For Amarok itself, Beta 1 seems to be version 2.0.90. Simply grab my ebuild, which is a slightly modified version from official portage tree. I removed the iPod patch and the webkit string replacement, added 2 dependencies and had to fake the check for Qt script bindings. Of course, I needed to replace the download URI and add ~amd64 keyword:

--- /usr/portage/media-sound/amarok/amarok-2.0.1.1.ebuild       2009-03-16 11:36:10.000000000 +0100
+++ amarok-2.0.90.ebuild        2009-04-12 17:53:18.000000000 +0200
@@ -14,10 +14,11 @@
HOMEPAGE="http://amarok.kde.org/"
LICENSE="GPL-2"
-KEYWORDS="~x86"
+KEYWORDS="~x86 ~amd64"
SLOT="2"
IUSE="daap debug ifp ipod mp3tunes mp4 mtp njb +semantic-desktop"
-SRC_URI="mirror://kde/stable/${PN}/${PV}/src/${P}.tar.bz2"
+#SRC_URI="mirror://kde/stable/${PN}/${PV}/src/${P}.tar.bz2"
+SRC_URI="ftp://ftp.kde.org/pub/kde/unstable/${PN}/${PV}/src/${P}.tar.bz2"

DEPEND=">=app-misc/strigi-0.5.7
|| (
@@ -25,6 +26,8 @@
>=dev-db/mysql-community-5.0[embedded,-minimal]
)
>=media-libs/taglib-1.5
+       >=media-libs/taglib-extras-0.1
+       >=x11-libs/qt-script-4.4.2
|| ( media-sound/phonon x11-libs/qt-phonon:4 )
>=kde-base/kdelibs-${KDE_MINIMAL}[opengl?,semantic-desktop?]
>=kde-base/plasma-workspace-${KDE_MINIMAL}
@@ -46,7 +49,8 @@
app-arch/unzip
daap? ( www-servers/mongrel )"

-PATCHES=( "${FILESDIR}/${PV}-ipod.patch" )
+# PATCHES=( "${FILESDIR}/${PV}-ipod.patch" )
+PATCHES=( )

pkg_setup() {
if use amd64 ; then
@@ -77,9 +81,10 @@
fi

# Remove superfluous QT_WEBKIT
-       sed -e 's/ -DQT_WEBKIT//g' \
-               -i "${S}"/src/scriptengine/generator/generator/CMakeLists.txt \
-               || die "Removing unnecessary -DQT_WEBKIT failed."
+#      sed -e 's/ -DQT_WEBKIT//g' \
+#              -i "${S}"/src/scriptengine/generator/generator/CMakeLists.txt \
+#              || die "Removing unnecessary -DQT_WEBKIT failed."
+       sed -e 's/CHECK_CXX_SOURCE_RUNS/set( BINDINGS_RUN_RESULT 1 )\n#/g' -i "${S}"/cmake/modules/FindQtScriptQtBindings.cmake

mycmakeargs="${mycmakeargs}
$(cmake-utils_use_with ipod Ipod)

You may need to uninstall Amarok 1 before emerging 2 or you will get file collisions.

Some problems
I have 2 soundcards and Amarok was playing everything on the wrong one on first start-up, although I had reordered my cards in the configuration dialog. I seem to have gotten rid of that problem by simply restarting Amarok after changing the order. The GStreamer backend currently crashes, so you may prefer xine (which I do anyway).

I couldn’t get Ampache working yet, but I assume the fault is on Ampache’s side. We were able to track the problem down to the hash used for authorization, which seems to be an issue between 32 bit servers and 64 bit clients (apparently the timestamp differs; I really don’t know why that’s still a problem, I guess it’s very bad programming or at least testing…).

Amarok doesn’t index all files correctly. Some files are not shown with metadata but only with their filenames instead. You may be able to work around it by rescanning your collection multiple times. There also seem to be some index issues on the collection when updating the collection while browsing through it. However, I still got some MP3s not showing up in my collection although they seem to be counted.

Oh, and Amarok scrobbles streams to Last.fm, which I consider a bug since that’s not the intended behaviour (or at least has not been until now).

I will try to confirm these bugs and report them if they are not already listed in KDE’s bugtracker.

Some scripts currently don’t work. That’s because we faked around the script bindings dependency in order to be able to compile. You may get some ebuild to compile qtscriptgenerator, but it depends on Qt 4.5 which I haven’t yet installed (yes, I know I blogged about the recommendation to use it one post earlier, but I haven’t had time to recompile it yet – I should do that next). Do not report errors about scripts not running to Amarok devs; it’s entirely our own fault. However, if you need those scripts, you are free to get into compiling that dependency as well. 😉

Make it prettier
Amarok’s default theme doesn’t look appealing, at least not when using KDE’s Oxygen theme with default colors. You may have a look at alternative Amarok themes on kde-look.org. Most themes inherit their colors from your color theme, so don’t be surprised if it looks different than on the screenshots. I went for “Amarok Highlights” for the time being.

Unfortunately, Amarok 2 is still lacking a theme manager, so you have to install the themes manually by copying them to ~/.kde4/share/apps/amarok/images/ (one at a time; stylesheet.css goes one level above). You should quit Amarok before changing themes and clean caches before restarting it:

rm ~/.kde4/cache-YourHostname/kpc/Amarok-pixmaps.index
rm ~/.kde4/cache-YourHostname/kpc/Amarok-pixmaps.data

Gentoo and KDE4

Everyone wanting to upgrade to KDE4 with Gentoo should be advised not to trust the official guide completely. It’s a good start but incomplete and misleading. You should have a look at it nevertheless:

http://www.gentoo.org/proj/en/desktop/kde/kde4-guide.xml

You should know the following facts before getting yourself into trouble:

  • You need KDE 3.5.10 (unstable) prior to KDE 4 if you plan to use both alongside each other; you will be blocked otherwise. Some applications of KDE 3.5.10 will break nevertheless, at least if you install without kdeprefix (see below).
  • You don’t need an unstable portage 2.2 for EAPI 2. Well, you kind of need it nevertheless: portage 2.1.6.* IS portage 2.2 but with all new features (except EAPI 2) disabled. If you have enough time, know Python and want to help the portage developers, please do; you know where to find them. If you run a more experimental system (you are unmasking KDE 4, right?), you could unmask 2.2 nevertheless and help Gentoo by testing and submitting bug reports.
  • You need to unmask lots and lots of packages. You can use this file and NOT the file you get there (both are linked in the KDE4 guide but the last one is for the kde-testing overlay only). Unfortunately you will need to unmask even more. You can use unmasker, a nice tiny tool that does all that nasty unmasking stuff for you while you do better things. Following the comments you will end up with a directory /etc/portage/package.keywords/ and some files in it. Emerge will use all of them. Have a look at the generated file(s) and modify/comment/delete any lines you don’t like. There will be some (e.g. SVN versions).
  • Before starting that huge 250 ebuild session (twice that size if you need KDE 3.5.10), make sure you have kdeprefix in your USE flags! The guide says it would not be necessary to use it but trust me: it is if you want to save yourself some headaches.
  • Having compiled all ebuilds and noticing random crashes and malfunctions? Start dbus and hald first; you need them in KDE4 or you get weird problems. (that’s not in the guide either 🙁 )
  • Using unstable xorg 1.5 and have no keyboard anymore? Or are you just wondering why you should use 1.5? See this short thread.
  • You use NVidia drivers and see flickering red or black checkerboards covering your videos? Well, guess you are using Amarok. It took almost a year to find that out. Just exit Amarok or open and close the playlist and control window a few times, the flickering will disappear. BTW that also happened with Compiz Fusion and KDE 3.5.
  • Having stability problems? Make sure you use nvidia-drivers 180.27 or 180.29 for the time being, do NOT use 180.37 for now.
  • Your icons disappeared? Check that the Inherits=hicolor setting is correct (find the icon set configs and possible options for Inherits using locate index.theme), also check for deprecated icons (folder icons etc.) and switch them. Maybe it also helps to upgrade to Qt 4.5 and of course x.org 1.5 if you didn’t do that already; this could also fix some weird stuff going nuts in your system tray. Still missing icons? Bad luck I guess. The Oxygen icon set still seems incomplete and Inherits doesn’t seem to work in all cases, so you may try switching to some other icons instead. (Although this is 4.2, isn’t it? But what did other distributions do to fix it if it’s really incomplete/broken?)
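For the Inherits check from the last point, here is a rough sketch of how an index.theme file could be inspected. It assumes the file follows the usual INI-like layout with an [Icon Theme] section; the function names are made up, and some older themes may use a differently named section, in which case this simply reports nothing.

```python
import configparser


def read_inherits(index_theme_text):
    """Return the list of themes named in the Inherits key of an index.theme file."""
    # No interpolation and no strictness: theme files may contain '%'
    # characters or repeated keys that would otherwise raise errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key names case-sensitive, as in the file
    parser.read_string(index_theme_text)
    value = parser.get("Icon Theme", "Inherits", fallback="")
    return [name.strip() for name in value.split(",") if name.strip()]


def inherits_hicolor(index_theme_text):
    """True if hicolor is among the inherited themes."""
    return "hicolor" in read_inherits(index_theme_text)
```

Feed it the contents of each file that locate index.theme turns up; a theme for which inherits_hicolor() returns False is a candidate for the missing icons.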

These won’t be the only problems you come across, and I wrote some of them down from memory because I upgraded about a month ago. What I wrote may not be sufficient, but it should save you at least 5 hours of work (or even more) figuring everything out yourself.

Good luck!

Stack Overflow

Stack Overflow is a community-driven website where you can ask any programming-related question and answer other people’s questions. It is built on a well-designed reputation system: you start out restricted to posting questions and answers only. By getting “up” votes on your posts you gain reputation points and, with them, more permissions such as voting on and commenting on other people’s posts. With a rather large number of reputation points you even get moderator permissions for the platform.

Registration is quite easy. If you have any account supporting OpenID (such as Yahoo/Flickr) you will be able to login right away. Avatars are being loaded from Gravatar. Before asking questions you should try searching for earlier posts on that topic; one idea behind Stack Overflow is to build a large FAQ of programming questions. I’ve recently got Stack Overflow threads in Google search results as well (and in most cases they immediately solved my problem).

Questions and answers move at a rather high pace. Many people monitor the list of newest questions and respond within about 3-30 minutes. It’s not uncommon to be overtaken while still writing an answer; in these cases you get a notification at the top of the site which you can click to see the latest answers while retaining your unsent answer. Voting is quick, too. But in almost every case quality matters, and since there are some extra points for accepted answers, it is never wrong to submit your own answer if it adds some new aspects, even to highly frequented threads.

It’s best to try it yourself but be warned: That site is addictive! 🙂

Browser warning for IE6 users

Since my willingness to do all those annoying extra bugfixes for IE6 is pretty low 7 1/2 years after its release, I finally decided to display a little nag notice on top of my page after having been encouraged to do so by others (see below). The script supports multiple languages, offers a few options for quick setup, and you are free to use, distribute and modify it for your websites (even commercial ones) without any further agreement (consider this a license; needed in Germany).

Please note this does not mean this site (and others I create in my free time or at work) will completely stop working in IE6 in the near future, but some functionality or design might be incomplete/broken because I will no longer optimize my personal web sites for IE6. (At work we and our customers cannot afford to stop optimizing for IE6 yet, since some companies still won’t upgrade in the near future for reasons that are understandable in most cases, so either our customers or theirs would not be able to use the websites we build.) Nevertheless it won’t hurt to display a relatively unobtrusive message.

The icon I used for it is from famfamfam’s free mini icon set: http://www.famfamfam.com/lab/icons/mini/

Other sites that encourage their users to upgrade: (not using my script)

Get a preview here: (Note: this may not display correctly in browsers other than IE 6 – no sense to support it elsewhere 😉 )

English
German

JavaScript and CSS (quick setup options on top; language and download arrays inside warn-function)

You will also need jQuery in case you don’t have it already. (include jQuery and CSS before including the nag-script)

Broken

In case anyone should be wondering why I didn’t continue my project to install Gentoo: Well… Unfortunately that slug is broken.

USB seems to be damaged in some way as it reports nothing but rubbish to the kernel and something causes bad noises to come from the beeper. Some forum threads pointed out that might be caused by a bad power supply. However, after changing the AC adapter it doesn’t run any better in my case. Since I wasn’t able to flash the original firmware correctly (it was just too obvious that I manipulated that box) I decided to open it up and take a look at its PCB.

Unfortunately I can’t see any physical damage on that board, so I must assume there’s some hidden defect somewhere. I doubt I can fix it, so I must consider that box broken after just 4 months. After all I have read I must advise everyone NOT to buy an NSLU2, unless you are up for some soldering on the board. The hardware fails much too often from what I could find, be it from overheating, manufacturing or just design errors (like the fancy “feature” that you can power the box solely through an external USB hub). Sadly, if I cannot find the error or a workaround for it, I may not continue this project since I am not willing to pay full price for another box that’s broken by design.

Either I buy one very very cheap at eBay or I will just let it die. Maybe someday I might be able to fix it – could be my studies might be of some help at some later point.

Anyway; this project was no complete loss of time. Much of the knowledge I gained can be helpful in getting Gentoo onto the Pandora when it finally arrives (which unfortunately may take another 3-4 months depending on LCD production since I assume I’m somewhere in the last third of the preorder queue).

msleep() vs. mdelay()

Arrrrgh…. I finally got it….

For months I have been trying to figure out why my cross-compiled kernel and rootfs won’t boot on the NSLU2 aka Slug. Since I didn’t want to solder pins for the serial I/O onto the board I tried a quite unique approach: Since the Slug has 4 LEDs that can be easily controlled through GPIO, I could hook up at some function and send all output through the LEDs. Sure this would be slow (~1 character per second, maybe 1.5) but that would be enough to at least read kernel panic messages while retaining full warranty on the hardware.

Finding out how the LEDs were controlled took me about 1.5 hours, hooking in and testing the first things another hour. I hooked in before the call of init (disk 2 on), right after init (disk 1 on) and on kernel panics (both disk LEDs on). So what happened: it got to init and then crashed with a panic after a few seconds. I figured out I could hook into uart_console_write() or panic() and then read any output by blinking a byte in 4 steps (one bit on each disk LED, signal indication by setting power to green and “clock” indication through power amber). Well, it started blinking for hours and hours and hours… But all I could decode was just infinite rubbish, no matter what I tried. Even a comparison i[…] That was the critical fault.

msleep() seems to suspend the currently running task, so it is non-blocking regarding the whole system.

mdelay() blocks the system (or at least the active CPU) if running single-threaded.

So why was that small change critical to my code? I don’t know exactly. But I know that panic() disables scheduling before any further action. So what happens if some code fragment used by panic() tries to rely on scheduling? Something seems to get corrupted very seriously; maybe some kind of heap or stack overflow happens. Maybe some process/scheduler data gets screwed up. I don’t know. But that seemingly tiny difference between blocking and non-blocking functions (which function does what isn’t always that clear if you’re new to Linux kernel programming) really makes a very big difference.

I finally recorded my kernel panic on dv tape and will decode it tomorrow using a simple tool I wrote. If it is finished I will make it available from this website.
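To give an idea of what such a decoder has to do, here is a minimal sketch. It is not the tool mentioned above. It assumes that each of the 4 steps carries two bits (one per disk LED), that the highest bits are sent first and that disk 1 holds the higher bit of each pair. The bit order and LED assignment are guesses, and the extraction of LED states from the video as well as the power LED “clock” handling are left out entirely.

```python
def encode_byte(value):
    """Split one byte into 4 steps of (disk1, disk2) LED states, high bits first."""
    steps = []
    for shift in (6, 4, 2, 0):
        pair = (value >> shift) & 0b11
        steps.append(((pair >> 1) & 1, pair & 1))
    return steps


def decode_steps(steps):
    """Rebuild one byte from 4 (disk1, disk2) LED states."""
    if len(steps) != 4:
        raise ValueError("a byte needs exactly 4 steps")
    value = 0
    for disk1, disk2 in steps:
        value = (value << 2) | (disk1 << 1) | disk2
    return value


def decode_message(steps):
    """Decode a flat list of LED states; an incomplete trailing byte is dropped."""
    usable = len(steps) - len(steps) % 4
    data = bytes(decode_steps(steps[i:i + 4]) for i in range(0, usable, 4))
    # Misread bytes show up as replacement characters instead of aborting.
    return data.decode("ascii", errors="replace")
```

Under these assumptions the letter “B” (0x42) is blinked as the four LED states (0,1), (0,0), (0,0), (1,0), so a single misread step garbles exactly one character.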

Here’s a small excerpt from the ~20 minutes long message transmitting “BUG: sched[…]” (I did not decode more than that yet):

Edit: I decoded all 20 minutes. What was readable (the decoder is only a quick and dirty solution so far) led me to “BUG: scheduling while atomic:” in the kernel source; it was simply caused by a remaining msleep in my LED function. I got rid of it and now get a clean message “Attempted to kill init!”. Now, that’s where the debugging begins…

Slug (Linksys NSLU2)

After I read about a quite inexpensive (about 65 to 75€) embedded system on a forum two weeks ago, I needed to get one of these myself. The system has two USB host ports and an ethernet interface, 32MB SD-RAM, 8MB flash memory and a 266MHz ARM (Intel XScale) CPU (underclocked @133 MHz until mid 2006 production dates). It’s running Linux with a modified RedBoot bootloader. It’s originally intended to be a NAS server for USB hard drives but can be flashed with different Linux kernels and images. Unfortunately it’s already getting old (first released in 2004) and was reported to be discontinued so I had to decide to buy it now or never. I bought it:

If that device is completely new to you, the article on Wikipedia (en/de) may provide a good starting point for more information on what is possible. If you get interested in it, nslu2-linux.org provides a great resource to answer almost all your questions.

My goal is to get Asterisk, DHCP, DNS, OpenVPN and maybe a small webserver to run on it. However I haven’t reached that yet.

Java JAX-WS tutorial and standalone

If you’re wondering why I post this in English: I felt the urge to post in a way so Google & Co. can find it on the keywords I was unable to get any helpful results for.

For a homework assignment we have to use JAX-WS to implement a sample SOAP web service. I don’t understand why we were not simply allowed to continue using Axis (well, they said setting it up on the server side would be too complex, but just deploying the WAR archive and using the JWS approach without complex types/classes would have been enough for such an exercise). JAX-WS not only requires Java 1.6 (for which Apple doesn’t think it worth providing a 32 bit JRE/JDK, so I can’t use my iBook to work on it), but compared to Axis I also found it extremely badly documented (or its web site too confusing). I was unable to find the information I was looking for and got stuck on pages telling me what generated files look like after they have been generated. For some reason they just failed to provide a quick introduction like “Getting started on implementing a web service” that gives a short overview of what’s necessary to get your workspace and environment up and running.

I thought I would need to set up Tomcat 6, compile, run wsgen, write lots of configuration files, WAR everything and then deploy it. I was wrong. After hours of unsuccessful attempts and searches I remembered having seen something about JAX-WS on IBM DeveloperWorks some time ago. I headed for their website and there it was: Design and develop JAX-WS 2.0 Web services describes everything you need to know to get started with JAX-WS. It even shows a one-liner to start the web service standalone, without the hassle of deploying it. Just read until at least section 4 (page 8 in the PDF) and have a look at the example code, class OrderWebServicePublisher. You’ll be surprised.

BTW: In case you are using Gentoo you may notice some JDK tools are not yet linked to /usr/bin (e.g. wsgen :P). You may want to have a look into /opt/sun-jdk-…/bin. Doesn’t affect the Ant task though.