Wednesday, November 3, 2010

~/bin: apod-set-bg

Astronomy Picture Of the Day shows a new, usually desktop sized, picture every day. apod-set-bg is a shell script to download the daily image and set it as the desktop background. It also downloads the explanation as a separate html file.

Requirements: curl, tidy, xmlstarlet, feh.

#!/bin/sh

IMG_DIR="$1"

APOD_URL=http://apod.nasa.gov/apod/ 

URL_CONTENT=`curl $APOD_URL | tidy -q -asxml`

IMG_URL=$APOD_URL`echo "$URL_CONTENT" | xml sel -N html='http://www.w3.org/1999/xhtml' -t -v '//html:img/ancestor::html:a/attribute::href' -`

EXPLANATION=`echo "$URL_CONTENT" | xml sel -N html='http://www.w3.org/1999/xhtml' -t -c '/html:html/html:body/html:center[2]' -c '/html:html/html:body/html:p' -`

FILE_NAME=`basename "$IMG_URL"`

FILE_PATH="$IMG_DIR/$FILE_NAME"

curl -o "$FILE_PATH" "$IMG_URL" || exit 1

echo "$EXPLANATION" > "$FILE_PATH.html"

ln -sf "$FILE_PATH" "$IMG_DIR/latest.jpg"
ln -sf "$FILE_PATH.html" "$IMG_DIR/latest.html"

feh --bg-fill "$FILE_PATH"

Put the following in your crontab to run daily.

@daily ID=apod DISPLAY=:0.0 ~/bin/apod-set-bg ~/bg/ >/dev/null 2>&1

Thursday, September 16, 2010

tvnz-grab in factor

The previous post in factor. I wrote it so my Windows using friends could have a single-binary solution. Download the zip archive and unzip somewhere in the windows path; C:\windows\system32 will do.

Usage: tvnz-grab <episode-page-url>. It will download all parts of the episode into the current directory.

Unfortunately, this only works for NZ content. Foreign content uses Adobe’s encrypted RTMP protocol. To get at episodes using that, check out rtmpsuck.

Imports first.

! Copyright (C) 2010 Jeremy Hughes.
! See http://factorcode.org/license.txt for BSD license.
USING: accessors assocs combinators.short-circuit fry
http.client io.streams.byte-array kernel namespaces make
regexp sequences xml xml.data locals splitting strings
io.encodings.binary io io.files command-line http system
math.parser destructors math math.functions io.pathnames
continuations xml.traversal ;
IN: tvnz-grab

Because I intend to extend this program into a small Qt demo, it is necessary that any words used to display UI information, dispatch on the type of UI.

SYMBOL: ui
SINGLETON: text

And the display methods themselves…

HOOK: show-progress ui ( chunk full -- )
HOOK: show-begin-fetch ui ( url -- )
HOOK: show-end-fetch ui ( -- )
HOOK: show-page-fetch ui ( -- )
HOOK: show-playlist ui ( seq -- )
HOOK: show-fatal-error ui ( error -- )

M\ text show-progress uses the symbols bytes and count to keep track of the number of bytes downloaded and the proportion of progress-bar fill respectively.

: print-bar ( full chunk -- )
    count [
        [ swap / 50 * round ] dip [
            - CHAR: =
             >string write
        ] [ drop ] 2bi
    ] change ;

M: text show-progress
    swap bytes [ + [ print-bar ] keep ] change flush ;

M: text show-begin-fetch
    "Fetching " write print "[" write flush ;

M: text show-end-fetch
    "]" print flush ;

M: text show-page-fetch
    "Fetching TVNZ page..." print flush ;

M: text show-playlist
    length "Found " write number>string write " parts." print
    flush ;

M: text show-fatal-error
    dup string? [ print ]
    [ drop "Oops! Something went wrong." print ] if 1 exit ;

Failed HTTP request errors need to be wrapped in a user friendly explanation.

: wrap-failed-request ( err -- * )
    [
        "HTTP request failed: " % [ message>> % ]
        [ " (" % code>> number>string % ")" % ] bi
    ] "" make throw ;

The playlist parameter in each episode’s web page is in a section of code looking like this.

var videoVars = {
    playlist: '/content/3685181/ta_ent_smil_skin.smil',
    config: '/fmsconfig.xml',
    ord: ord
};

Given this code could change unpredictably, we’ll use nothing more robust than a regular expression to get at the playlist path.

: get-playlist ( url -- data )
    http-get [ check-response drop ]
    [ R/ (?<=playlist: ').*(?=')/ first-match ] bi* [
        "http://tvnz.co.nz" prepend http-get [
            [ check-response drop ]
            [ wrap-failed-request ] recover
        ] dip
    ] [ "Could not find playlist at address." throw ] if* ;

The playlist is an XML file of which only the video elements concern us.

<video src="path-to-segment.flv" systemBitrate="700000"/>

700000 appears to be the highest bit rate so that is what we’ll go for.

parse-playlist takes the output of get-playlist and returns a list of URLs to episode segments.

: parse-playlist ( data -- urls )
    bytes>xml body>> "video" "700000" "systemBitrate"
    deep-tags-named-with-attr
    [ [ drop "src" ] [ attrs>> ] bi at ] map [ ] filter ;

Each segment is downloaded in chunks.

: call-progress ( data -- )
    length response get check-response
    "content-length" header string>number show-progress ;

: process-chunk ( data stream -- )
    [ stream-write ] [ drop call-progress ] 2bi ;

: get-video-segment ( url -- )
    [ show-begin-fetch ] [ ]
    [ part-name binary  ] tri
    [ '[ _ process-chunk ] with-http-get drop flush ]
    with-disposal show-end-fetch ;

: get-video-segments ( urls -- )
    [ get-video-segment ] each ;

grab-episode is where the action starts.

: (grab-episode) ( url -- )
    show-page-fetch get-playlist parse-playlist dup
    show-playlist [
        0 bytes count [ set ] bi-curry@ bi get-video-segments
    ] with-scope ;

: grab-episode ( url -- )
    [ (grab-episode) ] [ nip show-fatal-error ] recover ;

For the complete program see my github.

Saving TVNZ “ondemand” episodes

The code in my previous post will work for episodes from TVNZ’s ondemand service; but there is a more direct approach.

Each episode page stores a URL to the episode’s playlist in a javascript property "playlist". The URL points to an XML file containing the URLs of four or more video parts and no ads. The following script downloads the episode page, grabs the playlist URL, downloads the playlist, and parses it. It returns a newline separated list of video URLs.

You will need to have curl and xmlstarlet installed.

#!/bin/sh

if [ "$1" == "" ]; then
    echo "usage: $0 episode-url"
    exit 1
fi

EPISODE="$1"

PLAYLIST=http://tvnz.co.nz`curl "$EPISODE" | grep -Po "(?<=playlist: ').*?(?=')"`

curl "$PLAYLIST" | \
    xml sel -N smil=http://www.w3.org/ns/SMIL \
        -t -m "//smil:video[@systemBitrate='700000']" -v '@src' -n -

I have saved it as tvnz-grab and call it like so:

tvnz-grab <episode-page-url> | xargs curl -O

The four or five parts can be played as a single file using mplayer’s "-fixed-vo" option.

Saving flash video on linux

On linux, the flash-player caches video in files under /tmp. Each video gets a file called /tmp/Flash[random-string]. These are easy enough to find and rename to [meaningful-string].flv. However, some players delete these files on completion, which is a pain given they are the files containing the video we want to save.

Hard linking these files solves the problem. The Flashxxx files will be deleted by the player, but their contents will sill be available at whatever location we made the hard link.

Some players split their content into multiple files, so we want this hard linking process to be automatic.

I have the following script saved as lnflv in my ~/bin.

#!/bin/sh

cd /tmp
for x in Flash*; do
    count=`find . -xdev -samefile "$x" | wc -l`
    if [ $count -gt 1 ]; then
        echo "Not linking $x. Already linked."
    else
        echo "Linking $x."
        ln "$x" "dl-$x.flv"
    fi
done

Run it every thirty seconds like this: watch -n30 lnflv. Each flash file not already hard linked will be linked to /tmp/dl-Flashxxx.flv.

Order by time and rename video parts with something like this:

alpha=a
for f in `find . -size +10M -name dl-Flash\* | xargs ls -tr`; do
    cp "$f" [name]$alpha.flv
    alpha=`echo -n $alpha | tr 'a-y' 'b-z'`
done

Change [name] to the name of whatever you are watching. "-size +10M" filters out files smaller than 10MB; this screens out short ads and is only required for video streams that embed them.

Friday, August 27, 2010

Old blog: Map in Java

(Found in my years old defunct blog. Less relevant now that Java is getting lambdas.)

Scheme
(map fn lst ...)

;;  ==> list
      

map is a higher order function that applies the function fn of n arguments to each element in the lists lst ...

Java has no map. Enhanced for isn't adequate for the following reasons:

  • No return value: Since for is a control structure, not a function, it doesn't return anything useful. This means a data structure has to be created beforehand, and added to at each iteration of the for loop.
  • No function arguments: This is not for's problem. Functions (methods) are not first class in Java.
  • One iterable only: The enhanced for loop can only iterate over one Iterable data structure. map can operate on any number of lists.

Examples

Simple one list mapping

Scheme
(define squares (list 1 4 9 16))

(define (sqrts nums)
  (map sqrt nums))

(sqrts squares)

;;  ==> (1 2 3 4)
      

Java
/* No generics for clarity.  Assume also the existance of a static
   method `toList' that takes an Array and returns an ArrayList. */

List squares = toList(new Integer[] {1, 4, 9, 16});

List sqrts(List nums) {
    List out = new ArrayList();
    for (Integer i : nums)
        out.add(Math.sqrt(i));
    return out;
}

sqrts(squares);

//  ==> [1, 2, 3, 4]
      
Multiple list mapping

Scheme
(define even (list 2 4 6 8))
(define odd (list 1 3 5 7))

(define (sums . lists)
  (apply map + lists))

(sums even odd)

;;  ==> (3 7 11 15)
      

Java
/* Assume the existance of a static method `isSameLength' that tests
   whether all Lists in an array are the same length. */

List even = toList(new Integer[] {2, 4, 6, 8});
List odd = toList(new Integer[] {1, 3, 5, 7});

List sums(List... lists) {
    List out = new ArrayList();
    if (!isSameLength(lists))
        throw new RuntimeException("Lists must have same length.");
    int numargs = lists.length;
    int length = lists[0].size();
    for (int i = 0; i < length; i++) {
        int val = 0;
        for (int k = 0; k < numargs; k++)
            val += lists[k].get(i);
        out.add(val);
    }
    return out;
}

sums(even, odd);

//  ==> [3, 7, 11, 15]
      

The problem with this is that it isn't generalised.

Map in Java

First we'll need a something that acts as a first-class function.

Java
public interface Proc {

    public Object apply(Object... args);

}
      

Next, a static map method.

Java
public static List map(Proc proc, List... lists) {
    List out = new ArrayList();
    if (!isSameLength(lists))
        throw new RuntimeException("`map' requires lists of identical length");
    int numargs = lists.length;
    int length = lists[0].size();
    for (int i = 0; i < length; i++) {
        Obj[] args = new Obj[numargs];
        for (int k = 0; k < numargs; k++)
            args[k] = lists[k].get(i);
        out.add(proc.apply(args));
    }
    return out;
}
      

Now, the multiple lists example again.

Java
List even = toList(new Integer[] {2, 4, 6, 8});
List odd = toList(new Integer[] {1, 3, 5, 7});

Proc sum = new Proc() {
    public Object apply(Object... args) {
        return (Integer)args[0] + (Integer)args[1];
    }
};

List sums(List... lists) {
    return map(sum, lists);
}

sums(even, odd);

//  ==> [3, 7, 11, 15]
      

Much shorter, but with some limitations:

  • It doesn't use generics, and therefore
  • requires casting in Proc implementations; and
  • it is hardcoded to return an ArrayList.
Parametrized map

First, Proc is changed.

Java
public interface Proc1<R, A> {

    public R apply(A arg);

}

public interface Proc2<R, A, B> {

    public R apply(A argA, B argB);

}

public interface ProcN<R, A, B,..., N> {

    public R apply(A argA, B argB,..., N argN);

}
      

And a parameterized map.

Java
public static <R, A, X extends List<R>> X map(Proc1<R, A> proc, X out, List<A> list) {
    for (A a : list)
        out.add(proc.apply(a));
    return out;
}

public static <R, A, B, X extends List<R>> X map(Proc2<R, A, B> proc,
        X out, List<A> listA, List<B> listB) {
    if (!isSameLength(listA, listB))
        throw new RuntimeException("`map' requires lists of identical length.");
    int length = listA.size();
    for (int i = 0; i < length; i ++)
        out.add(proc.apply(listA.get(i), listB.get(i));
    return out;

// And so on up to `map<R, A, B,..., N, X>.
      

The example would now be:

Java
List even = toList(new Integer[] {2, 4, 6, 8});
List odd = toList(new Integer[] {1, 3, 5, 7});

Proc2<Integer, Integer, Integer> sum = 
    new Proc<Integer, Integer, Integer>() {
        public Integer apply(Integer argA, Integer argB) {
            return argA + argB;
        }
    };

List sums(List listA, List listB) {
    return map(sum, new ArrayList(), listA, listB);
}

sums(even, odd);

//  ==> [3, 7, 11, 15]
      
The cost of parameterizing

The obvious limitation of the generic implementation is that arbitrary numbers of arguments are no longer supported in ProcN or map.

Reduce

reduce is easily written on top of Proc2.

Java
public static <R, A> R reduce(Proc2<R, R, A> proc, List<A> list, R initial) {
    int size = list.size();
    R out = initial;
    for (A a : list)
        out = fun.apply(out, a);
    return out;
}

List nums = toList(new Integer[] {1, 2, 3, 4});

Integer squareSum(List list) {
    return reduce(new Proc2<Integer, Integer, Integer>() {
        public Integer apply(Integer a, Integer b) {
            return a + (b * b);
        }
    }, list, 0);
}

squareSum(nums);

//  ==> 30
      

Friday, August 20, 2010

Why I like Factor

Factor is a relatively new language and implementation in the tradition of Forth, Lisp, and Smalltalk. Like Forth, Factor is concatenative and uses a postfix syntax. Also like Forth, Factor emphasises small procedures and constant refactoring. Like many Lisp and Smalltalk implementations, Factor compiles code into a loadable image. Like Lisp, Factor can perform arbitrary computation at compile time using macros while parse time evaluation allows the creation of new syntax forms.

Unlike Forth, Lisp, and Smalltalk, Factor is modern and unencumbered. Lisp suffers from an ossified specification, Smalltalk has lived inside its own image for so long it's out of touch with the rest of the world, and Forth provides few of Factor's higher level features like garbage collection and dynamic code updating.

Factor’s focus on correctness and efficiency is not commonly found in other modern ‘dynamic’ languages. The results speak for themselves.

1. Factor vocabularies are compiled to machine code. When using the interactive listener the code is compiled before execution (like SBCL).

2. Factor is fast. Not C fast, but SBCL fast.

3. Almost all structures are modifiable at runtime. This includes classes at any position in the hierarchy as well as FFI bindings. When reloading a modified vocabulary, definitions no longer present will be deleted from the running image. These changes are propagated to all dependent code and the changes to the running image are kept consistent.

4. A simple foreign function interface. No quirky header files or IDL necessary; everything is done from within Factor. No reloading of Factor necessary either: create bindings incrementally and test them as you go.

5. Deployment images. Factor’s deployment tool only loads code the resultant binary will run. Minimal image size for a hello world program is a few hundred kilobytes.

6. Common Lisp style condition system. Don’t let stack unwinding lose an exception’s context. Recover anywhere between where the exception is caught and where it was thrown.

Why having a competent designer pays off

In contrast to other popular languages like Ruby and Python, Factor does not suffer from limiting creeds like, “There should be one way to do it,” or from BDFLs who don’t see the value in useful language constructs discovered as early as the 70s. Here's the payoff...

7. Correct lexical scoping. Usually in Factor one uses its postfix syntax and data flow combinators, however it also sports a locals vocabulary defined entirely in factor which implements lexical scoping and lambdas. Despite the fact that this is an addition in a language where lexical variables are not the default, Factor's lexical scoping is correct and its lambdas are uncrippled. Python 2.x is unable to rebind a variable in a parent scope (3.x overcomes this by introducing a new keyword, ugh!), and Ruby only recently (1.9) gave blocks their own scope.

8. Usable higher order functions. Combinators are used in Factor all the time. For some unfathomable reason Ruby allows only one block argument per function while in Python the use of such esoterica as map and reduce is discouraged.

9. Tail call optimization. Yep, that’s right, that helpful recursion thingy. Another thing Python doesn't have because, oh I don’t know, ask Guido. Come to Factor and free your mind from loopiness!

10. Low, hi, and FFI. Factor scales well from highly abstracted garbage collected code, to micro optimization using inline ASM. Along the way, the FFI allows Factor quotations to be used as C callbacks, and provides factor-side memory allocation should you need to store values on the stack or in a memory pool.

11. Methods are orthogonal to classes. Generic words and their methods are defined outside classes. Adding a generic word to a preexisting class is as simple as defining it. Extensibility without nasty monkey patching or name clashes, and a nice fit with a hierarchy that can contain anything from tuples, to predicate classes and C structs. No hideously bloated base-class needed (I'm looking at you Smalltalk). If Programmer Pooh adds a display-in-opengl method to object he need not modify core code, nor will his method clash with any present or future methods on object.

Why having a smart community pays off

Smart, or at least willing to learn new things.

12. No whiners! Try educating a bunch of Java programmers in the utility of higher order functions or tail-recursion...

13. Macros, syntax, and combinators, oh my! If you don’t whine, you get cool stuff. Macros akin to those in Common Lisp, but hygienic like Scheme, because if you don’t have variables, you can’t capture them. Syntax words like Lisp’s reader macros, but better because no dispatch character craziness is necessary. Observe the usefulness of regular expressions that are syntax checked at parse time, or the EBNF vocabulary which compiles an inline EBNF description into parsing code. For higher order goodness check out Factor’s unique dataflow combinators. This is not your ancestor’s stack shuffling!

More stuff I like

14. Live documentation. Factor has live documentation ala Emacs, but prettier and better linked. The documentation is contained in separate files to the code it describes. For those of us who hate hunting for scratchings of code amid screeds of API comments, this is a good thing.

An acceptable Lisp?

Someone once wrote about Ruby being an acceptable Lisp. They were wrong. Ruby doesn’t have macros, reader macros, native compilation, or a REPL from which everything can be modified. (Oh, but it does have three different kinds of lambda!)

Factor isn’t an acceptable Lisp either. Factor is a mighty fine Lisp. Factor is better than Lisp. It has all the things that make Lisp great and more. Factor will make your code beautiful. Factor will cook you breakfast. Factor reads like English, (lisp (like (not))). All hail Factor!

Summation

I like Factor because it hits the sweet spot of pandering to nothing but being a great language. It lifts much of the good stuff from great languages of yore and gives them an improved spin. It hasn’t yet succumbed to the whining hordes of mediocrity. It is written by a team that really know what they are doing. If that doesn’t describe your language, then give Factor a try.