Planet Octave

July 12, 2014

Eugenio Gianniti

Factories for assembly

fem-fenics provides a couple of utilities for assembling the matrices and vectors that compose the algebraic system a weak formulation is reduced to when applying FEMs. Leaving aside all the input validation, their job boils down to creating proper DOLFIN objects to store these matrices or vectors, possibly applying boundary conditions, then building an Octave-friendly representation. This last task is quite critical for the implementation of the MPI parallelisation, as the underlying DOLFIN representation of matrices and vectors is transparently distributed among processes, thus rendering useless the serial code used to pass them on to Octave. Lately I implemented some new classes to manage this aspect, so I will highlight my design considerations.

The translation from DOLFIN's to Octave's data structures is logically a well defined task, but its implementation needs to vary according to serial or parallel execution. Furthermore, it strictly depends on the linear algebra back-end in use, since each one stores a different representation and exposes a different interface to access it. To address these difficulties, I wrote a hierarchy of factories that provides the adequate methods based on run-time needs. Moreover, this way the code can easily be extended to support more back-ends in fem-fenics (currently only uBLAS is available). There is an abstract class, which declares the interface of its derived ones, and a concrete factory implementing the uBLAS-specific methods.

Since future versions of fem-fenics will make several algebraic back-ends available, the hierarchy will expand, which means that the checks of the run-time configuration will eventually become more complex. Another issue comes from the need to use different types depending on information available only at run time. Both to encapsulate those checks, avoiding code duplication, and to solve the problem of choosing the right class, I added to the hierarchy a class implementing the Pimpl idiom. With this design, the "user code" in the C++ implementation of assemble.oct and assemble_system.oct just needs to create a femfenics_factory object and use it to extract the data structures of interest, while every other hassle is dealt with behind the scenes by this class.
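To make the design concrete, here is a minimal Python sketch of the hierarchy (class and method names mirror the post, but the bodies are illustrative stand-ins, not fem-fenics code):

```python
from abc import ABC, abstractmethod

class MatrixFactory(ABC):
    """Abstract interface for back-end specific DOLFIN -> Octave conversion."""
    @abstractmethod
    def to_octave_matrix(self, dolfin_matrix):
        ...

class UblasFactory(MatrixFactory):
    """Concrete factory: holds the uBLAS-specific conversion code."""
    _instance = None

    @classmethod
    def instance(cls):
        # static access to the singleton object, as described in the post
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def to_octave_matrix(self, dolfin_matrix):
        # stand-in for the real uBLAS -> Octave copy
        return ("octave_matrix_from_ublas", dolfin_matrix)

class FemfenicsFactory(MatrixFactory):
    """Pimpl-style front end: picks the right concrete factory at run time."""
    def __init__(self, backend="uBLAS"):
        if backend == "uBLAS":
            self._impl = UblasFactory.instance()
        else:
            raise NotImplementedError(backend)

    def to_octave_matrix(self, dolfin_matrix):
        # dispatch through the reference to the singleton
        return self._impl.to_octave_matrix(dolfin_matrix)
```

The "user code" only ever touches `FemfenicsFactory`; back-end selection stays hidden behind it.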

UML diagram of the new hierarchy

In the diagram above you can see the classes already implemented and an example class showing where future ones will fit. femfenics_factory has private methods to check which concrete class is the right one to use each time, and implements the public methods of the abstract class by dispatching the call through a reference. uBLAS_factory, as other concrete classes are expected to do, holds the real code for creating Octave matrices and vectors and exposes a static method, instance, which gives access to the singleton object of this type. femfenics_factory, in turn, uses this method to obtain the reference it needs for dispatching.

by Eugenio Gianniti at July 12, 2014 06:40 PM

MPI and the problems to face

These days I have started my investigations on the actual implementation of the MPI parallelisation in fem-fenics. I found out some information that I will point out here, together with my goals for the next weeks.

First of all, apparently MPI can be used without user intervention on serial code. This is a feature that DOLFIN boasts, but I would expect it not to carry over to fem-fenics, at least not without some effort on the implementation side. Furthermore, DOLFIN also offers a wrapper for MPI functionality, which will probably be helpful in managing data transfers among processes in the C++ code.

An issue that will need to be addressed is making ufl.m robust to parallel execution: in its serial implementation all processes try to open the same file, which results in an error that stops the computation. Anyway, even if they could all open the file and write to it, lines would be copied in random order or more than once, so it must be fixed.
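A common fix is to let a single process write the file while the others wait. A minimal sketch of the idea, with `rank` and `barrier` standing in for MPI calls (hypothetical names, not the fem-fenics API):

```python
def write_ufl(lines, path, rank, barrier=lambda: None):
    """Only rank 0 writes the UFL file; the others wait at a barrier.

    `rank` and `barrier` stand in for MPI calls (e.g. DOLFIN's MPI
    wrapper); the names here are illustrative."""
    if rank == 0:
        with open(path, "w") as fh:
            fh.write("\n".join(lines) + "\n")
    barrier()  # every process proceeds once the file exists
```

This guarantees each line is written exactly once and in order, whatever the number of processes.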

Finally, it seems that the partitioning procedure produces matrices that are not slices of the one assembled in serial execution. Due to this fact, I must dig into the algorithm to find out the proper way to merge the pieces and obtain the complete matrix, which will be stored as an octave_value to allow for further computation using Octave's features.

by Eugenio Gianniti at July 12, 2014 04:37 PM

July 03, 2014

Eduardo Fernández

ILU function - implementation details

Hi all,

The purpose of this post is to explain the details behind the implementation of the ilu function, my work during this first period of GSOC program. The files involved are:
  • src/
  • src/
  • src/
  • ilu.m
You can pull the repo using mercurial from:

--> src/

This file contains the implementation of the ILU0 algorithm, the easiest one. In this version the zero-pattern of the input matrix is not modified, so the final structure of the output matrix is known in advance, which simplifies things. For the milu = ['col'|'row'] option, both the IKJ and JKI versions of the algorithm need to be implemented to efficiently compute the compensation of the diagonal element with dropped elements. I managed to do both in the same function, changing just a few lines of code. Let's use the example from Matlab's documentation:

Example 1:

A = gallery('neumann', 1600) + speye(1600);
setup.type = 'nofill';
setup.milu = 'row';
[L,U] = ilu(A,setup);
e = ones(size(A,2),1);
norm(full(A*e - L*U*e))

ans =    1.4660e-14 (Very low, good)
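For intuition, the zero-fill constraint of ILU(0) can be sketched in a few lines of dense Python (purely illustrative; the actual implementation works on Octave's sparse CCS structures):

```python
import numpy as np

def ilu0(A):
    """ILU(0) sketch: incomplete LU keeping the zero pattern of A.

    Dense illustration of the IKJ variant; updates are simply skipped
    at positions outside the pattern of A."""
    n = A.shape[0]
    LU = A.astype(float).copy()
    pattern = A != 0                    # entries allowed in the output
    for i in range(1, n):               # IKJ loop ordering
        for k in range(i):
            if not pattern[i, k]:
                continue                # dropped position: skip
            LU[i, k] /= LU[k, k]
            for j in range(k + 1, n):
                if pattern[i, j]:
                    LU[i, j] -= LU[i, k] * LU[k, j]
    L = np.tril(LU, -1) + np.eye(n)     # unit lower triangular factor
    U = np.triu(LU)
    return L, U
```

On a tridiagonal matrix, which produces no fill-in anyway, this coincides with the exact LU factorisation.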

The following little function can be used, when milu = ['row'|'col'], to check that every column/row sum is preserved (not only with ilu0 but also with iluc and ilutp). Just run it after calling ilu in any form.

benchmark/check_sums.m (It can be found here in the repo)

function check_sums (A, L, U, milu)
  b = L * U;
  dim = 1;
  if (strcmp (milu, 'row'))
    dim = 2;
  endif
  c = sum (b, dim);
  d = sum (A, dim);
  v = abs (c - d);
  num_zeros = length (find (v > sqrt (eps)));
  printf ('Number of rows-columns not well compensated: %d\n', num_zeros);
  if (num_zeros > 0)
    v(find (v > sqrt (eps)))
  endif
endfunction

NOTE: I have found that in Matlab 2013a the row and column summations are not always preserved, and the row and column compensation fails for ilutp and iluc. I will show an example later.

--> src/

This algorithm is the trickiest one due to pivoting, and it has caused me more than one headache during its coding because it is not well described in Saad's book, just a few hints. I have found several bugs in Matlab 2013a's implementation that make me a bit reluctant to trust the correctness of its results.

Error 1

A = sparse ([3 4 4 3 -1.1; 2 0 8 9 2.2; 2 1 9 9 1; 3.2 10 2.3 2 4.5; 9 2 6 2 1]);
setup =
       type: 'ilutp'
       milu: 'col'
       droptol: 0.2000
       thresh: 0
       udiag: 0

>> [L, U, P] = ilu(A,setup);

sum(A(:, 2))          => 17
B = L*U; sum(B(:, 2)) => 14.4857

Clearly the sum of the second column is not preserved :/.

Error 2

A = sparse([3 1.5 0 0 1.44; 0 1 0 0 -2;0 0 8 0 0; 0 2 0 2 -4.5; 0 -1 0 0 1]);

setup =
       type: 'ilutp'
       milu: 'col'
       droptol: 0.5000
       thresh: 0.2000
       udiag: 0

>> [L, U, P] = ilu(A,setup);

The output is:

U =
    3.0000    1.5000         0         0         0
         0         0         0         0         0
         0         0    8.0000         0         0
         0         0         0    2.0000       Inf
         0         0         0         0      -Inf

L =
     1     0     0     0     0
     0     1     0     0     0
     0     0     1     0     0
     0   Inf     0     1     0
     0     0     0     0     1

What are those Inf doing there? Clearly the zero pivots are not being detected correctly.

Error 3

A= sparse([3 1 0 0 4; 3 1 0 0 -2;0 0 8 0 0; 0 4 0 4 -4.5; 0 -1 0 0 1]);

setup =
       type: 'ilutp'
       milu: 'row'
       droptol: 0
       thresh: 0
       udiag: 0

>> [L, U, P] = ilu(A,setup);


L =

    1.0000         0         0         0         0
    1.0000         0         0         0         0
         0         0    1.0000         0         0
         0    1.0000         0    1.0000         0
         0   -0.2500         0         0    1.0000

That 0 cannot be there. By construction L has to be a unit lower triangular matrix, and that zero element spoils the L*U product. Again WRONG.

I have encountered more issues when testing Matlab with some test matrices of dimensions 2000x2000 and 5000x5000. With them my output is not the same as Matlab's (the nnz of L and U differ), but taking into account the errors I found, I trust my version more than theirs. By the way, in my case the row and column sums were preserved, while theirs were not. Obviously I have checked that those examples behave correctly in my code, detecting 0 pivots.

An example similar to the ilu0 one can be run:

Example 2:

A = gallery('neumann', 1600) + speye(1600);
setup.droptol = 1e-2;
setup.type = 'ilutp';
setup.thresh = 0.5;
setup.milu = 'row';
[L,U] = ilu(A,setup);
e = ones(size(A,2),1);
norm(full(A*e - L*U*e))

ans =  2.5170e-14 (Nice)

Pivoting: it is worth mentioning how pivoting is performed in this algorithm. When milu = 'row' the U matrix is column-permuted (IKJ version used), but when milu = ['off'|'col'] L is the permuted one, and it is row-permuted (JKI version used). Both algorithms share a lot of similarities, and the code is designed to work in one version or the other depending on the milu option. That way code duplication is avoided. That was one of my primary fears when I realised that both versions were needed to attain Matlab compatibility.

--> src/

This is the file containing the Crout version of ILU. This version is an enhancement of the pure IKJ and JKI variants of Gaussian elimination. At iteration k, the k:n section of the k-th column and of the k-th row is computed. The enhancement shows in the execution time for the same input matrix. The following example is a comparison between my versions of ilutp and iluc:

For a 2000x2000 matrix (I have not included this matrix in the repository due to its size):

With setup.droptol = 0.01 and setup.milu = 'off'.

ilutp  -->  12.3458 seconds
iluc    -->  6.31089 seconds

ilutp  -->  12.868686 seconds
iluc    -->  7.498106 seconds

That is just to illustrate the performance of different versions.
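The Crout update itself can be sketched on a dense matrix: at step k it computes the k-th row of U and the k-th column of L. This is an illustrative Python version, without the dropping rules and sparse storage the real iluc uses:

```python
import numpy as np

def crout_lu(A):
    """Dense Crout sketch: at step k, compute row k of U and column k of L.

    Illustrative only; iluc adds dropping and works on sparse storage."""
    n = A.shape[0]
    L = np.eye(n)                       # unit lower triangular factor
    U = np.zeros((n, n))
    for k in range(n):
        # row k of U: eliminate contributions of previous columns
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]
        # column k of L: same elimination, then scale by the pivot
        L[k + 1:, k] = (A[k + 1:, k] - L[k + 1:, :k] @ U[:k, k]) / U[k, k]
    return L, U
```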

NOTE: In iluc the dropping strategy for elements of U (stored as CRS) is to drop element aij if abs(aij) < droptol * norm(A(i, :)). For the L part (stored as CCS), aij is dropped if abs(aij) < droptol * norm(A(:, j)).
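The dual rule can be written out explicitly; a small illustrative helper (not the actual iluc code, which operates on the sparse structures directly):

```python
import numpy as np

def keep_entry(aij, A, i, j, droptol, part):
    """Illustrative dual dropping rule: U entries (CRS) are measured
    against the norm of row i of A, L entries (CCS) against the norm
    of column j. Returns True when the entry survives."""
    ref = np.linalg.norm(A[i, :]) if part == "U" else np.linalg.norm(A[:, j])
    return abs(aij) >= droptol * ref
```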

Finally the numeric example:

Example 3:

A = gallery('neumann', 1600) + speye(1600);
setup.droptol = 1e-2;
setup.type = 'crout';
setup.milu = 'row';
[L,U] = ilu(A,setup);
e = ones(size(A,2),1);
norm(full(A*e - L*U*e))

ans =  2.5212e-14 (Nice)

That is all I wanted to show for now. I have written tests for the functions and adapted several from Kai's work last year. However, I want to add some more function-specific ones for validating results. The last thing pending is to place the source files inside the Octave source tree; I am not totally sure where they should go. On the other hand, I have already started to work on the ichol function, and next week I'll report on my progress.

I know the post is a bit long, but I think it is needed given how little I have written on the blog during this period. I am aware of that (Jordi pointed it out to me a few days ago) and I will take it into account in the following weeks.



by Eduardo at July 03, 2014 04:09 PM

June 27, 2014

David Spies

Midterm Blog Post

I’ve neglected this blog.  Sorry

What I’ve done so far:

Changeset 1: Removed second internal representation from PermMatrix (all PermMatrix are strictly column-major now): Accepted!

Changeset 2: Created a generic templated “find” method which consolidates the many different implementations of “find”.  This change includes adding nz_iterators, i.e. iterators over the nonzero elements of any of the four array types (Array, Sparse, DiagArray2, PermMatrix): Pending jwe’s review

Changeset 3: Added a “dispatch” function which dispatches interpreter function calls to the proper matrix template-type.  jwe warned that the community might be uncomfortable with such a major change, but so far the community’s been almost entirely unresponsive except for one enthusiastic and supportive email from Carne. (Thanks!)

Note, these first three changesets have not changed significantly since I implemented them in the FIRST WEEK of GSoC, yet only one of them has actually been fully reviewed and accepted so far.


Changeset 4 WIP: Started working on an “accumulate” function to use as the back-end implementation of sum, product, any, all, min, and max.  In particular I was writing a version that will properly accumulate over the rows of tall sparse matrices.  I stopped this in the middle and figured I’d come back to it.  I still plan to.

Changeset 5: Added bounds-checking and reference-checking to Array::xelem (when debugging options are turned on).  I’ve been advised I should profile this change, but there aren’t any recommendations on what benchmark to use.  Experience suggests whatever benchmark I use, the community won’t accept it.

Changeset 6 WIP: Logical indexing into sparse matrices.  This is the primary change my GSoC project is meant to be centered around.  I have implemented a form of it, but it doesn’t yet handle any of the edge cases.  Also my implementation relies heavily on the iterators and dispatch function from changesets 2 and 3 respectively so if jwe ultimately doesn’t accept those, then I don’t know what I’ll do.

Experimenting with Matlab shows that Matlab will reshape the index matrix to match the height of the result matrix, and Matlab’s reshape apparently uses >64-bit arithmetic to determine the size of the resulting matrix.  This leaves me in somewhat of a bind: if I want to do the same, I need to add gmp as a dependency, but no one wants to add new dependencies.

Considering that in this particular instance I’m guaranteed to know the column height (and nobody asked me to re-implement reshape, so I won’t), I think I’ll just use repeated addition/subtraction to do the multiply and divide.  It will still run in only O(#columns) time, which is how long a reshape should take.
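The repeated-subtraction idea might look like this (a hypothetical helper illustrating the approach, not the actual changeset):

```python
def rows_from_total(total, width):
    """Divide total by width using repeated subtraction only.

    Hypothetical helper: no wide multiplication or division is needed,
    just one subtraction per quotient unit, so the loop count stays
    proportional to the quotient."""
    rows = 0
    while total >= width:
        total -= width
        rows += 1
    if total != 0:
        raise ValueError("width does not divide total evenly")
    return rows
```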


On one hand, I’m nearly done with the primary thing my project is supposed to do.  That’s good.  But on the other hand I should have finished it and have moved on to other stuff by now.  I understand that community/mentor approval is an important thing, but if the community is really so strict, then why is Octave such a mess of buggy copy-pasted ill-considered code?  Who approved all of that?


by dnspies at June 27, 2014 05:43 PM

Eduardo Fernández

Mid-term pre-post.

Hi all,

This is a short post just to clarify my state at midterm. As I have arranged with Kai, at the beginning of the next week I will write a verbose post to explain all the details related to the development of "ilu" function. I need a couple of days to tidy up all the code and make it presentable.

The code is now functional. It lacks tests and documentation and needs a bit of refactoring, which I will do this weekend. I will list several relevant points about the implementation of the function.

  • It is programmed to output the same as Matlab.
  • My version is at least as fast as Matlab's, outperforming it by small amounts of time in cases with large outputs.
  • I found a bug, at least in Matlab 2013a, regarding the "ilutp" option when using milu="col". The column sum is not preserved in at least one case, so in my test cases my function does not output the same. I will explain that issue in more detail next week.
In the end I am happy with the performance of the function and with meeting the time-line proposed at the beginning of GSOC (give or take a few days of delay).

Future work: in the second part of GSOC I will implement the "ichol" function. There are several points about its development to discuss with Kai, because he implemented the code last year but there were some issues with its licensing. This period is a bit longer and I will have no classes or exams. Because of that, if I have time remaining at the end, I can start implementing the "minres" or "lsqr" algorithms, which Octave also lacks. So there will be no time wasted.

See you,


by Eduardo at June 27, 2014 12:09 AM

June 25, 2014

Eugenio Gianniti

Mid term accomplishments

I will try to give a comprehensive feel of what I achieved in this first part of the Google Summer of Code, since it is time for the mid term evaluation. Let's start with an example: as usual, it is the Poisson equation, but today, as a twist, we consider a fully Neumann problem. In order for such a problem to be well posed, an additional constraint is needed, otherwise the solution would not be unique, so in the Octave code there is the Lagrange multiplier c. Here you can find more details and the C++ and Python code; I will just write down the differential problem for convenience:

- Δu = f in Ω
∇u ⋅ n = g on ∂Ω

Here is the Octave code that solves the above mentioned problem:

pkg load fem-fenics msh

ufl start NeumannPoisson
ufl CG = FiniteElement '("CG", triangle, 1)'
ufl R = FiniteElement '("R", triangle, 0)'
ufl W = CG * R
ufl "(u, c)" = TrialFunctions (W)
ufl "(v, d)" = TestFunctions (W)
ufl f = Coefficient (CG)
ufl g = Coefficient (CG)
ufl a = "(inner (grad (u), grad (v)) + c*v + u*d)*dx"
ufl L = f*v*dx + g*v*ds
ufl end

# Create mesh and function space
x = y = linspace (0, 1, 33);
mesh = Mesh(msh2m_structured_mesh (x, y, 1, 1:4));

W = FunctionSpace ("NeumannPoisson", mesh);

# Define variational problem
f = Expression ('f', @(x,y) 10*exp(-((x - 0.5)^2 + (y - 0.5)^2) / 0.02));
g = Expression ('g', @(x,y) - sin (5.0 * x));

a = BilinearForm ("NeumannPoisson", W, W);
L = LinearForm ("NeumannPoisson", W, f, g);

# Compute solution
[A, b] = assemble_system (a, L);
sol = A \ b;
solution = Function ('solution', W, sol);

u = Function ('u', solution, 1);

# Plot solution
[X, Y] = meshgrid (x, y);
U = u (X, Y);
surf (X, Y, U);

At the very beginning you can see a block where every line starts with ufl. That is what you would have had to put in a separate UFL file before this summer. In a sense it is not plain UFL: there are extra double and single quotes. They are needed because, with the current version of Octave, those brackets with commas inside would otherwise be interpreted as function calls. After this block closes with the ufl end line, the resulting UFL file is compiled to obtain a FunctionSpace, a BilinearForm and a LinearForm. These are oct-files that fem-fenics will use later on to define the corresponding variables in Octave. A robust implementation of ufl.m, the function that provides this binding to the UFL language, is one of the results of the first term.

At the end of the snippet you can see that the solution u is evaluated on its domain exactly as you would expect of a regular function taking two arguments and returning one value. This is due to the new subsref method of the function class, which is used to represent the elements of a function space. Aside from surface plots, this feature can be of interest for generalising methods that rely on analytical solutions of differential problems, or for applying basically any algorithm to such functions. Here is the plot you will obtain with this script:

I wrote in an earlier post about the interpolate function: with it you can get the representation of a Function or Expression on a given FunctionSpace. It is useful, for instance, to compare your numerical solution with an exact one you happen to know. Or, in the example above, you might want to see what the forcing term looks like:

f_cg = interpolate ("f_cg", f, u);
F = f_cg (X, Y);
surf (X, Y, F);

There is one last achievement to highlight for the mid-term evaluation: currently both the initial compilation of the package and the just-in-time ones performed when importing UFL instructions proceed smoothly without user intervention. To this end, the build system now relies on pkg-config to get at once all the flags needed for proper compilation and linking, since some dependencies of dolfin, the FEniCS interface, are not to be found in standard directories. In order to exploit the extracted information also for the subsequent run-time builds, the autoconf substitution is performed in the get_vars.m auxiliary function as well, which in turn provides it to generate_makefile.m. An implementation detail that proved quite tricky is how to pass all the preprocessor flags to mkoctfile: only a subset of the options of g++ are hard-coded in it, so I needed to resort to a workaround. Indeed, CPPFLAGS are always passed as environment variables and not as command-line flags, so that mkoctfile will just copy and deliver them to the real compiler.

To further enhance the build system, I implemented other internal functions that hash the UFL file being compiled and, later, check the hash to determine whether the file changed between the previous and the freshly requested build. In the example above, you will find four new files in your working directory after a run: the three already mentioned oct-files and a text file storing the md5 sum of the UFL that has been imported. Until one of these files gets somehow deleted or the problem in the ufl block changes, you will not need to take on a time-consuming compilation any more.

by Eugenio Gianniti at June 25, 2014 09:13 PM

June 23, 2014

Eugenio Gianniti

Function evaluation

As said in the previous post, my latest result is the possibility of evaluating a fem-fenics function at a point of its domain. This way it is possible to generalise methods that would otherwise rely on analytical solutions. In [1] we have an example of such a method.

The paper deals with the deposition of nanoparticles in tissues for the treatment of cancer. The phenomenon is described with a Monte Carlo simulation of the particles' trajectories, assuming that the velocity field of the carrying fluid is known. In this study, some simplifying hypotheses about the geometry of the cells and the fluid layer nearby allow for an analytical solution of the Stokes equation. Unfortunately, these assumptions do not hold generally in human tissues: for instance, in the liver cells have a cubic shape, in contrast to the spherical one used in the paper. Now, if this method is implemented in Octave, we can solve the Stokes equation numerically on a realistic domain and obtain right away a more general approach to this significant application.

The evaluation of a fem-fenics function was already possible via the feval method, but it had some glitches. One aspect is that the solution of a differential problem could not be used as if it were a regular Octave function, so users had to adapt their algorithms to take advantage of it. A more critical issue is that the previous implementation did not handle the exception raised by the underlying FEniCS method when given the coordinates of a point outside the domain, thus leading to a crash of Octave.

In order to address these problems, I added the subsref method to the function class and implemented the proper exception handling in feval. To avoid code duplication, the former relies on the latter for the real computation, so it basically just forwards the parameters after checking that the right type of indexing was used. As a result, it is now possible to solve the equations:

- ν Δu + ∇p = 0
∇ ⋅ u = 0

with relevant boundary conditions, on a proper mesh and finite element space, and then evaluate the solution with the Octave expression values = u (points), where points is a matrix holding the coordinates of every evaluation point, one per column. Moreover, a careless evaluation will no longer crash your Octave session.
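The subsref idea can be illustrated with a tiny Python analogue, where `__call__` plays the role of subsref and forwards to a feval-like method that checks the domain (all names here are illustrative, not the fem-fenics API):

```python
class Function:
    """Python analogue of the subsref idea: a solution object you can call.

    `_feval` stands in for the fem-fenics feval method; out-of-domain
    points raise an exception instead of crashing the session."""
    def __init__(self, evaluator, box):
        self._evaluator = evaluator
        self._box = box                 # (lo, hi) bounding box, 2D sketch

    def _feval(self, x, y):
        lo, hi = self._box
        if not (lo <= x <= hi and lo <= y <= hi):
            raise ValueError("point outside the domain")
        return self._evaluator(x, y)

    def __call__(self, x, y):
        # subsref analogue: check the indexing, then forward to _feval
        return self._feval(x, y)
```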

Even though this feature of the package has undergone some improvement, there is still room for more. Two issues I have not addressed yet are the somewhat awkward interface and the possibility of creating a function handle to perform evaluations with. Regarding the former, note that the above expression stays exactly the same regardless of the geometrical dimension of the domain. I should modify the implementation so that a vectorial function on a 3D space is evaluated with [ux, uy, uz] = velocity (x, y, z). As for the latter, in my understanding the class design would have to change to exploit the Octave internals managing functions, so this requires careful reflection on all the possible side effects of such a change.

by Eugenio Gianniti at June 23, 2014 01:11 AM

June 16, 2014

Eugenio Gianniti

Goals for future development

The mid-term review is approaching, so it is time to highlight what is done, what is underway and what the future goals are. In this post I will try to do so as clearly as possible.

Mid-term review

The main effort during the first part of the Google Summer of Code was the implementation of the bindings for the UFL language in Octave. Now UFL code can be written directly in m-files, without the need of a separate file to define the problem. To this end, ufl has been implemented for opening a file, writing to it and importing the variational problem when it is complete.

Further, I implemented interpolate, which allows the interpolation of a Function or an Expression on a given FunctionSpace. This can be of interest to test the validity of a discretisation method, for instance if an analytical solution is available in closed form, so that it is possible to compare it with the numerically obtained one.

Lately, I focused on the build system, both for the package compilation and for the just-in-time builds needed to import variational problems in Octave. The former is now backed by pkg-config, so that all the proper compiling and linking options required by the dependencies are obtained at once. As for the latter, this information is used to configure the get_vars function accordingly, which provides it to the one that generates the Makefiles used to compile oct-files just-in-time. As a result, these oct-files are now recompiled only when necessary, for example if one of them has been deleted or if the UFL file has changed.

In the upcoming week I will add another feature: it will be possible to get a function handle for evaluating a Function. This way the solution of a variational problem can be used exactly like any other function in Octave, for instance allowing the generalisation of algorithms that rely on exact solutions of differential problems and are thus limited to simple cases. I will provide some details on an application in my post about this feature.

Final review

In the second part of the project I will be mainly committed to parallelising the package's execution via MPI. As noted in an earlier post, parallelisation through the OpenMP paradigm was quickly abandoned because it does not provide a significant performance gain, while opening the way to bugs and errors. Parallelism is, anyway, an interesting feature for the package's use cases, so it will be the main goal of the final hand-in.

by Eugenio Gianniti at June 16, 2014 01:13 AM

June 15, 2014

Eugenio Gianniti

Just-in-time compilation

One of the known issues of the fem-fenics package was related to the errors during the just-in-time compilation, due to missing include directories. Among my preliminary contributions there is a changeset addressing the problem in the initial build of the package, when installing it into Octave. Now I went on and solved it also during the usage of fem-fenics.

At the moment of the first build, autoconf is in charge of finding out the relevant compiler and linker flags through pkg-config. They are then substituted in the Makefile, which compiles the package making use of them. This piece of information is also needed when a UFL file is imported and transformed into the oct-files used to turn the weak formulation at hand into an algebraic system, but until now the user had to supply it by means of an environment variable.

Now I have added a new utility function that provides those flags. In the configuration process they are substituted also in get_vars.m, which is called by generate_makefile.m when a differential problem is imported. The latter replaces two placeholders and writes the ad hoc Makefile with all the necessary compile and link options. This way users will not need to provide compilation flags any more; instead, the package will manage this aspect on its own.
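Conceptually, the substitution step is a simple textual replacement; a Python sketch of what generate_makefile.m does with its two placeholders (placeholder names are illustrative, not the actual template's):

```python
def generate_makefile(template, cppflags, ldflags):
    """Replace the two placeholders with the flags get_vars provides.

    Illustrative sketch: placeholder names are made up for the example."""
    return (template.replace("@CPPFLAGS@", cppflags)
                    .replace("@LDFLAGS@", ldflags))
```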

As noted in a previous post, however, this just-in-time build is relatively time consuming, taking around half a minute each time. Nonetheless, a common usage pattern could entail solving the same weak formulation on varying meshes or with different boundary conditions, forcing terms or physical parameters. None of these situations requires recompiling the problem's oct-files, since they carry information only about the function space and the formal expressions of the bilinear form and the linear operator. It is useful, then, to take on the build process only when needed.

To add this feature, I created three functions to perform the appropriate actions. After every successful just-in-time compilation, save_hash.m takes care of hashing the imported UFL file and writing the result to <ufl-filename>.md5sum. On the other hand, at the beginning of every import_ufl_*.m function, a check like this is performed:

         if (check_hash (var_prob) ||
             ! check_oct_files (var_prob, "Problem"))

You can see in it the two remaining functions implemented lately. The first one, check_hash.m, receives as argument the name of the variational problem, reconstructs the UFL file name, looks for a saved hash sum and compares it with the current file's. It returns true if the proper .md5sum file is not found or if the new and old hashes differ. Clearly, the oct-files should also be rebuilt if one of them is missing: check_oct_files.m looks for the relevant files, with its second argument stating which import is underway (thus, which files are expected as output), and returns true if they are all available.
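A Python sketch of this caching scheme (the real functions are m-files; the file layout follows the post, the details are illustrative):

```python
import hashlib
import os

def save_hash(ufl_file):
    """Store the md5 digest of ufl_file next to it, in <ufl-filename>.md5sum."""
    digest = hashlib.md5(open(ufl_file, "rb").read()).hexdigest()
    with open(ufl_file + ".md5sum", "w") as fh:
        fh.write(digest)

def check_hash(ufl_file):
    """Return True when a rebuild is needed: no cached digest, or a stale one."""
    digest = hashlib.md5(open(ufl_file, "rb").read()).hexdigest()
    cache = ufl_file + ".md5sum"
    if not os.path.exists(cache):
        return True
    return open(cache).read().strip() != digest
```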

by Eugenio Gianniti at June 15, 2014 04:18 PM


These days I have worked on the implementation of the interface to the OpenMP-powered assembly offered by FEniCS. Despite being potentially a one-line intervention, it proved quite tricky: indeed, with the needed addition, the fem-fenics function for system assembly broke with a huge number of run-time errors, probably due to a change in the underlying data structure that is transparent to library users, but does not go unnoticed if you need to access it directly, as fem-fenics does. This led me to leave this functionality behind.

My choice is backed by some computational experiments. They show that the approach taken by the FEniCS library is quite effective, with assembly times halved when using four threads instead of one. However, assembly times are negligible compared to the linear system solve phase, even when the OpenMP parallelisation is disabled. I used a great number of mesh nodes in order to get meaningful timings: even when linear systems took some minutes to solve, the assembly phase lasted at most a couple of hundredths of a second in serial code. If we add to these findings that the fem-fenics package requires a just-in-time compilation lasting around half a minute, we understand that there is no point in devoting effort to the implementation of this feature.

by Eugenio Gianniti at June 15, 2014 12:49 PM

Eduardo Fernández

ILUTP ready, steady, .... almost go.

It has been a bit more than two weeks since my last post. I just wanted something solid enough to show before writing again :). Because one image is better than a thousand words, this is the state of my project so far:

Green is what is finished and working (obviously...), pink is what is partially finished, and red is not working at all.

 ILUTP implementation:

As I did with the ilu0 function, I started implementing ilutp using the IKJ variant of Gaussian elimination, as Prof. Saad does in his book. To work efficiently with the CCS (compressed column storage) structure of sparse matrices, only a transposition is needed before and after the process. So I came up with a working version without pivoting using this strategy a week before this post (src/ file in the repository). All OK till that point. Well... it was not all OK. When pivoting comes into play, everything gets messy: it is not feasible to do row pivoting efficiently after transposing the matrix and using the CCS structure with the IKJ algorithm. What I realized is that Matlab, by default, implements a JKI variant of the algorithm for the milu="col" and milu="off" options. This way row pivoting can be used and no transposition is needed with the CCS structure. So for the whole of last week I had to rewrite the function almost entirely to implement it the JKI way. That was a serious delay because I was not familiar with that variant. On the other hand, I also came to the conclusion that the milu="row" option demands an IKJ implementation with column pivoting. That can be inferred from the documentation:

 "....When SETUP.milu == 'row', U is a column permuted upper triangular factor.  Otherwise, L is a row-permuted unit lower triangular factor."

Column pivoting means that if CCS is used as the storage structure (as Octave does), the strategy must be to transpose, perform the IKJ algorithm with column pivoting, then transpose again. So another implementation is needed, and that is the reason milu="row" is not working with ilutp yet: I had no time to implement that variant with pivoting. However, I am halfway there thanks to my early IKJ implementation, so I am working on it.

I am taking special care to output exactly the same as Matlab, which means figuring out some nuances of their implementation that can only be understood after trial-and-error experimentation with their ilu version. I have tried to test the function intensively, and for my test cases my version outputs the same as Matlab's.

I have integrated the ilu0 and ilutp functions inside an m-file wrapper called ilu.m, located in the root directory of the repository. The file was written last year by Kai and needs to be changed a bit, but for now it is OK to provide a user-friendly interface for trying my functions. Use it the same way as you would in Matlab.

A quick script to test it could be:

A = sprand (100, 100, 0.5);
setup.thresh = 0.4;
setup.droptol = 0.005;
setup.type = 'ilutp';
[L, U, P] = ilu (A, setup);

To get the code, pull from here:

Just execute make in the root directory and then start the Octave interpreter there as well.

Next week I am planning to finish the implementation of the milu option in both ilu0 and ilutp. (You can find the files as src/ and src/ in the project directory.)

P.S.: For those who care about performance (I do), my version is a bit faster than Matlab's. You can try it on big matrices. I did: for low values of droptol (meaning few terms of the matrix are dropped), using pivoting and relatively big matrices (5000x5000), my version took around 200 seconds and Matlab's 220. For a 2000x2000 matrix, the times were 19 seconds for Matlab's and 13 seconds for mine. The numbers are just to give you an idea, but they are good news.

See you!

by Eduardo ( at June 15, 2014 01:12 AM

June 12, 2014

David Spies

Generalized Dispatching

Note: Yech, the code formatting on this post came out terrible.  I’ll come back and fix it up later, I promise.


Almost immediately, when writing any function in Octave I run into a basic problem.  I’m not sure how many other people have recognized this problem, but it seems like no one’s yet bothered to deal with it.  So this seems like an unavoidable first task:


For efficiency purposes, everyone avoids OO polymorphism in favor of template polymorphism (all the indirection would inevitably slow down processing).  But if I want to write a templated function for dealing with matrices:

template<typename Mat>
octave_value_list myAwesomeFunction (Mat m)
{
  // Does something awesome
}

It turns out to be hard to call this function when all I have is an object of type “octave_value”.

For every type I want my function to support I need to handle this explicitly:

octave_value_list callMyFunction (octave_value m)
{
  if (m.isType1)
    return myAwesomeFunction (m.type1Value ());
  else if (m.isType2)
    return myAwesomeFunction (m.type2Value ());
  else if ...
}

You get the picture.  But it’s far worse than this, because with matrices, the types themselves are parameterized (by the element type), so now it looks more like:

octave_value_list callMyFunction (octave_value m)
{
  if (m.isType1 && m.innerTypeIsA)
    return myAwesomeFunction (m.type1AValue ());
  else if (m.isType1 && m.innerTypeIsB)
    return myAwesomeFunction (m.type1BValue ());
  else if (m.isType2 && m.innerTypeIsA)
    return myAwesomeFunction (m.type2AValue ());
  else if ...
}

And then this is re-written for every single core function in Octave.  Needless to say this process is buggy and there are many missing cases.  This is probably the primary reason why so many Sparse operations are broken.  The authors simply forgot to include isSparse in their list of checks.  Or if they remembered, then they forgot to include isDiag or isPerm.  And even if they managed to include all the matrix types, they might not have handled isComplex etc.

So I’m creating a utility file called “dispatch.h”.

The idea of dispatch.h is that it takes a template template parameter which is a functor to call back with the proper template argument.

Ideally, dispatch.h should contain the one-and-only implementation of that long if-else-if shown above, and then other functions can call the dispatch function to get the right template type.  From there, they can use overloading, specialization, typedefs, etc., whatever they need to generalize it (or not).  I’m first implementing it by copying the if-else-if clause from the “find” method, and then I’ll do the accumulate/reduce methods (any, all, sum, product, min, and max).

If it can handle all those different functions, I’ll be satisfied that it does the right thing.  In this way, we can have code that doesn’t fail sporadically depending on the type passed into it.  For example:

octave:2> [a,b,c,d,e,f] = find(speye(10))
panic: impossible state reached in file 'corefcn/' at line 221
panic: Aborted -- stopping myself...
attempting to save variables to 'octave-workspace'...
save to 'octave-workspace' complete
octave exited with signal 6

(nargout wasn’t properly checked for sparse matrices, only full ones)


octave:1> [i,j] = find(eye(1000000))
error: out of memory or dimension too large for Octave's index type

(Someone forgot to handle diagonal matrices)

If there’s one source of truth for what qualifies as “all-encompassing”, then the compiler will tell you when you forget to handle this type or that type.  No more ad-hoc dispatching.

by dnspies at June 12, 2014 05:30 AM

June 11, 2014

David Spies

Duplication versus Boilerplate

In my mind there’s a big difference between duplication and boilerplate.

Code duplication occurs when two functions (or, more generally, two blocks of code) do approximately the same thing with minor variations.  If those two blocks of code occur in different places in the code-base, then likely when one gets changed (i.e. to fix a bug) the other will be ignored.  Such is the case with the old implementation of “find”, in which there were multiple places where the dimensions of the return-matrix are computed (among other things).  When you eliminate code duplication, you generally have to make the replacement code more generic, but then you also need ways to construct specific instances to be passed to the generic code.

In an intelligent modern language like D, or a language which relies on higher-order functions like Haskell, OCaml, or Lisp (any functional language really), the story ends there.  You make your code more generic and pass the proper instantiations to it.

In an old cluttered imperative language like C++, constructing the proper instances of a generic function tends to require a lot of boilerplate code: code that consists of many short blocks with the same structure.  Boilerplate occurs all in one place, and unlike with code duplication, there’s no danger of an unsuspecting future developer “bugfixing” your code by changing one block while forgetting to change the others.  But it does indeed include “duplicated code”.  The difference is that the code in question tends to be a single function call or a method header.  It’s short, and difficult to abstract any further (without changing languages).

If you see a block of code that looks like:

switch (n)
  {
  case 1: return call_func<1> (args);
  case 2: return call_func<2> (args);
  case 3: return call_func<3> (args);
  default:
    error ("call_func should be called with a number from 1 to 3");
  }


please recognize this for what it is.  This is boilerplate.  It’s necessarily ugly because it’s written in C++, but it’s not “code duplication” in the same way that two implementations of binary search in different files are “code duplication”.  You can be sure whoever wrote this code struggled with it, thought about it, and determined that this was the most generic way to do whatever they were trying to do.


by dnspies at June 11, 2014 05:36 PM

June 09, 2014

Jordi Gutiérrez Hermoso

5 Things We Have Forgotten About Open Source

Note: in order to satisfy the exquisite tastes of today’s discerning internet readers, the following blog post is written in style.

We have been using open source for so long that we have forgotten, culturally, where it came from. It seems so natural and ubiquitous that we can no longer remember how things were before it. Some of us are young enough to have never even lived through times when open source wasn’t everywhere.

I am here to set the record straight on a few things, because I have noticed that even people who have lived through ye olden times have forgotten where things came from. Open source wasn’t spawned single-handedly by the sheer might of Linus Torvalds’s virtual gonads. Open source doesn’t mean that money is forbidden. Open source doesn’t mean that Richard Stallman is a twit.

1. “Open source” is a term coined by OSI

First things first, and the #1 thing most people have forgotten about open source: the term did not arise naturally. It was invented in early 1998 during the release of Netscape Navigator as the free Mozilla suite. The Open Source Initiative, composed of trailblazers such as Eric Raymond and Bruce Perens, decided that we needed a new name for what was about to happen. They got together with other people and Christine Peterson suggested the term, to much rejoicing. She then vanished back into the shadows and went back to being a nanotechnologist or something.

Wait, wait, wait, let me get this straight. Open source was created… by a girl?!?!

OSI is an organisation that got together for a single purpose: to keep saying “open source, open source, open source” so much until everyone else was saying it too. This was all in February 1998, remember. That means open source is barely a year older than The Matrix. Neo had probably not even heard about it, because…

2. Nobody called it “open source” before OSI

The greatest testament to how good OSI’s marketing campaign was is that we have come to believe that the term is so natural that we always just called it that. They have convinced us all that “open source” was our idea, without needing to get into our dreams to do so.

… and from now on, you will worship penguins and red t-rexes.

Needless to say, it was not our idea. By far, the most common way to refer to “open source” before 1998 was “free software”.

Now, I know what you’re thinking. “Oh god, not this stupid flamewar again. Jordi, we know you’re a FSF-spouting propaganda drivel machine, why do you keep pushing the stupid term for open source that Richard Stallman keeps talking about?”

Wait, wait, hear me out. It wasn’t just Richard Stallman who called it “free software”. You know FreeBSD? The “free” in there doesn’t just mean “without a fee”. They really do mean free as in freedom. Or look at what OpenBSD calls itself a few times while rocking out to sweet, sweet, pufferfish freedom:

[...] we instead celebrate the 10 years that we have been given (so far) to write free software, express our themes in art, and the 5 years that we have made music with a group of talented musicians.

Here is a cartoony yellow pufferfish fighting a fearsome cartoony black blob… but is it art?

That’s right, even the biggest haters of the FSF and the GPL, and the most ardent opponents of His Exalted Bearded Gnuliness Richard the Stallman call themselves “free software”.

Amusingly enough, you probably never really noticed this, but the very same Mozilla for whom “open source” was initially coined tried to call itself “organic software” for a while.

100% GMO-free. No pesticides. Hacked into being by loony hippies.

3. Open source has a precise definition

Now, here’s the thing: OSI didn’t just say, “here is open source, go wild and free, call anything you want open source!” Nope, in what might appear at first blush to be a cruel ironic twist, OSI did not make the definition of “open source” itself open source. In fact, they even trademarked “open source”, and ask that you only use the phrase according to their trademark guidelines!

Those controlling bastards trampling on our freedom with their smug little ®

Alright, so what does “open source” mean?

Well, in the beginning, Bruce Perens wrote the Debian Free Software Guidelines (there’s that pesky “free” term again). Then, he decided he was just going to grab those very same guidelines, run sed -i s/Debian/Open Source/g, and make that the official definition of open source.

This means that “open source” means a lot more than just “show me the code”. In particular it means that,

  • If you don’t let people sell it, it’s not open source.
  • If you don’t let people give it to their friends, it’s not open source.
  • If you don’t treat all recipients of your software equally, it’s not open source.

If you’re not a pinko commie ideologue, it’s not open source.

So why did OSI insist so much on a precise definition of open source? Well, because…

4. “Open source” is a synonym for “free software”

Okay, this is one that really gets people riled and the one where the flamewars arise. I am here to tell everyone that if you’re flaming over whether stuff is open source or if it’s free software, you guys need to chill the fuck out: everything that is open source is also free software, and vice versa.

I bet that declaration alone is gonna rile everyone up even more, eh?

This guy has tuned to perfection the built-in flamewar radar under his beard through years of hard labour in the Usenet grand banks.

Okay, let’s look at this from a different angle with an analogy. The issue here is with something that philosophers like to call intensionality vs extensionality.

You know how Canada is a constitutional monarchy, right? And you know how there is a Queen of Canada who is the head of state? The Constitution Act, 1867 establishes that Canada has a monarch. She has fun duties, such as being assigned the copyright of anything an employee of Her Majesty’s Government does. Great fun; I once had someone send us Octave patches under the name of Her Majesty the Queen in Right of Canada.

An elite hacker if I’ve ever seen one.

Now, you might recognise that lady above, and you probably also know that England also has a queen, and by now my astute readers and you have doubtlessly put together that the Queen of Canada also happens to be the Queen of England. Two names for the same person!

However, Canada’s Constitution Act doesn’t actually specify “The Queen of Canada will be whoever occupies the position of Queen of England”. It just says that Canada has a queen and goes on to list the duties of said queen. This is called the intensionality, the words by which we describe what something is. The extensionality refers to the actual objects in the world that are described by these words. In this case, “Queen of Canada” and “Queen of England” could, perhaps, under some weird political shenanigans end up being two different people, but in practice they end up referring to the same person. So the extensionalities of “Queen of Canada” and “Queen of England” are the same.

Couldn’t resist putting another picture of this lovely lady’s stylin’ fashion…

It is the same with free software and open source. The definitions look different, but in practice the software that they refer to ends up being the same. Oh, sure, there are some very minor disagreements over whether this or that license is OSI-approved but not FSF-approved or vice versa, but the whole point of coining “open source” was to have another word to refer to “free software”.

In other words, it was always OSI’s intention for “open source” to be a synonym for “free software”. Hell, even Bruce Perens said so. Why did OSI want a synonym?

5. Open source came with certain promises

The whole point of coining the phrase “open source” was to push a certain point of view. The biggest proponent for the “open source” phrase was Eric Raymond. He and OSI have always described open source as marketing for free software.

So this marketing campaign came with certain promises, promises that we have forgotten were ever part of a marketing campaign by OSI, because they’re so ingrained into open source itself. Stop me if you’ve heard any of these before:

  • Open source is a cheaper model to develop software
  • Open source ensures that software has fewer bugs, because more eyes can look at the source code
  • Release early, release often.
  • The best software is created by scratching an itch.

And so on… the whole point was to make free software attractive to business by de-emphasising the whole “freedom” part of it. Instead, OSI promised that by making your software open source, you would have better software, that open source was a better development model, leading to cheaper, less buggy software.

Less buggy? Really?

The “cheaper model” thing is also still a fairly popular meme nowadays. When you look at free projects in, one of the lines is how much money it would have cost to build this or that under some model called COCOMO.

I’m not trying to say that OSI is right or wrong about its promises. Some free software really is less buggy than non-free variants. It probably is way cheaper to develop Linux when all of the big companies chip in a few developers here and there to maintain it. All I’m saying is that we have forgotten that with the word “open source”, certain promises came attached to it. Some of these promises might even appear to be broken in some cases.

So next time you hear someone tell you that there will be fewer bugs and everyone will come sending you patches the moment you reveal your source code, remember that they’re repeating campaign slogans. And remember that even if those slogans might not always be true, there might be other reasons why you should give everyone else freedom to enjoy and distribute and hack your software.

by Jordi at June 09, 2014 04:37 PM

June 06, 2014

Jordi Gutiérrez Hermoso

X-Men: Days of Future Past, Explained in Mercurial Evolve

So this post made the rounds a couple of days ago, and it got me thinking… can Mercurial (hg) do any better? I think it can, especially with Evolve. Here is me describing how Evolve works:

As to the movie, if you have not seen it yet, you might want to wait until after you do, but the basic gist is a time-travel plot where they go back and fix timelines.

In the beginning

History is terribly wrong, an awful, crippling bug has been discovered way back in history, and it’s so terrible that a big chunk of current history has to be thrown out. Someone created evil sentinels, so evil that they decided to exterminate all mutants and most humans.

Finding the problem

Everyone digs back through the logs to find the cause of the problem. They know everything is bad now,

$ hg bisect --bad

but remember that some time in the past it was ok

$ hg bisect --good xmen-release-1.0

After some discussion,

$ hg bisect --good
$ hg bisect --bad
$ hg bisect --good
$ hg bisect --bad

the problem is revealed:

The first bad revision is:
changeset:   1024:0adf0c6e2698
user:        Raven Darkhölme <>
date:        Fri May 18 12:24:50 1973 -0500
summary:     Kill Trask, get DNA stolen

A bookmark is placed here for future reference

$ hg bookmark mystiques-first-kill -r 1024

Preparing Wolverine

Professor X and Magneto brief Wolverine on his impending task. The history has been made public, but the situation is so hopeless that hg admin Kitty Pryde decides to operate on Wolverine’s repo, the only one that could withstand the changes:

$ cd /home/wolverine/xmen
$ hg phase --draft --force -r 'descendants("mystiques-first-kill")'

Now Wolverine’s repo can endure any change. It’s a desperate move, but these are desperate times. Kitty sends Logan back:

$ hg update -r mystiques-first-kill

Making the fixes

Wolverine dispatches some minor thugs and squashes a few bugs, but the first change needs to alter the timeline,

$ hg amend -m "Attempt some wisecracks with some thugs"
137 new unstable changesets

Now all of the history that was based on top of this commit is unstable. It’s still there, for now, but things are rocky. Sentinels are approaching in the bad future and might kill everyone. Shit will get real there.

That’s ok, though, Wolverine is badass, doesn’t give a fuck, and goes about his business,

$ hg ci -m "Psychoanalyse Charles Xavier"  #Acceptable spelling for a Canadian
$ hg ci -m "New recruit: Peter Maximoff <>"
$ hg ci -m "Use Quicksilver to rescue Magneto"
$ hg ci -m "Stop Mystique from killing Trask (WIP)"
$ hg ci -m "Stop Mystique again from killing Trask"
$ hg fold -r .^ -m "Stop Mystique from killing Trask"
$ hg ci -m "Get metal painfully inserted into body. Then get drowned for good measure"

He decided that he didn’t want two separate commits for the same effect of stopping Mystique, so he folded those two commits into one. This is ok, because he’s still in draft mode.

Shelving working changes

Now Wolverine can’t do much about his current situation, and it’s up to others. So he decides to put his memory away for a while,

$ hg shelve

and now it’s up to Mystique’s less buggy version, disguised as Stryker, to revive Wolverine,

$ hg ci -m "Rescue Wolverine from only thing that *might* kill him"

and a whole lot of other merry developments happen offscreen:

$ hg ci -m "Rebuild the school"
$ hg ci -m "Get new recruits"
$ hg ci -m "Everyone's happy"
$ hg ci -m "Etc, etc"


At this point, the unstable history with the bad timeline is no longer needed. If the X-Men had wanted to keep any part of it, they might have used the hg evolve command, but they just want to forget the whole mess

$ hg bookmark --delete mystiques-first-kill
$ hg prune -r "unstable()"

and the whole thing just fades away. Wolverine reawakens in the future, along with his memories,

$ hg unshelve

and it’s up to him and future Professor X in the good timeline to fix all the merge conflicts that will ensue from this unshelving.

by Jordi at June 06, 2014 03:56 PM

June 03, 2014

Eugenio Gianniti

Towards a parallel fem-fenics

These days I have been studying the documentation of the FEniCS project, mainly the FEniCS book, in order to understand the features related to parallel execution that it boasts. This preliminary study is aimed at adding them to the fem-fenics package. First of all I will summarise my findings, then I will comment on the problems I need to address to implement this functionality.

Parallelism in FEniCS

FEniCS implements parallelism in a way that is transparent to the user of the library. Moreover, it scales on different architectures, ranging from multi-core personal computers to distributed clusters. To this end, FEniCS makes use of two paradigms, which can be exploited both separately and together.

The first approach is tailored for shared memory architectures, such as the vast majority of PCs nowadays, but also, in many cases, each node of a computational cluster. The implementation is based on OpenMP, and by adding a single instruction one can enable parallelisation to speed up the matrix assembly phase. It should be noted that this paradigm has little support in the underlying linear algebra libraries, so the resolution phase can take advantage of multi-threading only with the PaStiX solver. Since in a shared memory model parallel programs might suffer race conditions, the mesh is coloured to identify subsets such that no two neighbouring elements belong to the same set. Obviously, the notion of proximity depends on the particular function space, so this is considered in the colouring algorithm. The assembly proceeds by iterating over colours and splitting their elements among threads: with this technique race conditions are avoided and the user can enjoy the benefits of parallelisation without incurring unpredictable behaviour.

In contrast to the first approach, the second paradigm is based on MPI and addresses the needs of distributed memory architectures. Unfortunately, the latter is less immediate than the former, requiring a DOLFIN program to be launched with the MPI execution utility, but in this case the code need not be modified. In this implementation, the mesh is split so that each process gets its own part, with an algorithm striving to minimise inter-process communication. With scalability in mind, no single process holds the full matrix and, moreover, everything happens behind the scenes: this way the user need not take care of low level issues. The distributed memory paradigm is widely supported in the algebraic back-ends, so it allows the usage of several solvers, both iterative and direct. As already noted, the two approaches can be combined, for instance distributing the computation on a cluster and further speeding up the assembly by enabling multi-threading within each node, provided they are multi-core machines.

The implementation in fem-fenics

The shared memory paradigm should be quite straightforward to implement in fem-fenics. I expect to operate on three functions: the private generate_makefile, together with assemble and assemble_system. The first should have the proper compilation flag (-fopenmp) added. The other two should gain a new line reading like:

dolfin::parameters["num_threads"] = femfenicsthreads;

The number of threads could be passed to those functions as an argument, but this would break interface compatibility with FEniCS, so it is a poor approach. Another way of addressing the issue is to define a global Octave variable in PKG_ADD and store in it the desired number of concurrent threads to use for the assembly.

The implementation of the distributed memory paradigm, instead, seems quite tricky. Basically, Octave does not use MPI, at least not in core Octave. Nonetheless, there are two Forge packages with this goal, mpi and parallel. I will go through the documentation of these packages to understand whether, and if so how, they address the problem of launching the oct-file with mpirun or mpiexec. Even leaving this aspect aside, I still do not know how easily the distributed objects storing matrices and vectors can be accessed to obtain the whole data.

In conclusion, I will initially work on adding shared memory parallelism, at the same time looking deeper into the issues related to the distributed memory paradigm, which I suspect are more numerous than the ones highlighted here.

by Eugenio Gianniti ( at June 03, 2014 02:19 AM

May 26, 2014

Eduardo Fernández

Weekly post: Matlab ilutp behaviour disclosed and implemented in a m-file.

Here I am again.

This week has been more about researching than coding. I have finally been able to reproduce the output of Matlab's ilutp (ilu with threshold and pivoting) algorithm with an m-script (named ILU_pc.m in my project's directory). The fact is that Matlab does not implement the algorithm exactly as described in Yousef Saad's book, and it differs in a few ways. Because of that I had to do reverse engineering, testing many cases and matrices. Here is the function: ugly as hell, but it is just for testing purposes.

function [A, P] = ILU_pc (A, tau, thresh)

  B = A;
  n = length (A);
  P = speye (n);
  for i = 1:n
    for k = i:n
      % Scale the column so the diagonal is favoured by the pivot search
      A(k:n,k) *= thresh;
      A(k,k) /= thresh;
      [m, mi] = max (abs (A(k:n,k)));
      % Undo the scaling
      A(k,k) *= thresh;
      A(k:n,k) /= thresh;
      mi = mi + k - 1;
      % Swap rows k and mi, recording the permutation in P
      tmp = A(mi,:);
      A(mi,:) = A(k,:);
      A(k,:) = tmp;
      e = speye (n);
      e(mi,mi) = 0; e(k,mi) = 1;
      e(k,k) = 0; e(mi,k) = 1;
      P = e*P;
    endfor
    for k = 1:i-1
      % Drop small entries relative to the norm of the original column
      if ((A(i,k) == 0) || (abs (A(i,k)) < (tau * norm (B(:,k)))))
        A(i,k) = 0;
      else
        A(i,k) = A(i,k) / A(k,k);
        A(i,k+1:n) = A(i,k+1:n) - A(i,k) * A(k,k+1:n);
      endif
    endfor
  endfor

  % Final dropping pass on the upper triangular part
  for i = 1:n
    for j = i+1:n
      if (abs (A(i,j)) < (tau * norm (B(:,j))))
        A(i,j) = 0;
      endif
    endfor
  endfor
endfunction
  • The next goal is obviously to implement the function as an .oct file, translating this algorithm into a sparse one using Octave's internal data types.
  • All the testing I did was at college using their Matlab license. That delayed me because I could do almost nothing during the weekend. Now that I have a function that reproduces the behavior of Matlab's version, I can test my C++ code against it.

See you next week!

by Eduardo ( at May 26, 2014 03:54 PM

May 23, 2014

David Spies

Mathematical meaning of the word “has”

I’ve been working on some stuff.  I’ll get to that in my next post.  For the sake of having something written, I’ll pontificate on what “has” means as it caused some confusion for me while working with Octave.

After a brief discussion with John on the Freenode channel, I discovered that my idea of the word “has” is different from its meaning in Octave terminology.

If every full carton has a dozen (12) eggs and a fridge drawer has three full egg-cartons, then the drawer has 3×12 = 36 eggs.

We can also work backwards.  If a fridge shelf contains only full cartons of eggs and we know that it has 36 eggs, then it must be the case that this shelf has 36/12 = 3 cartons.

But for 3-dimensional matrices, this is not so.  How many columns does this matrix have?

octave:1> zeros(2,2,2)
ans =

ans(:,:,1) =

0   0
0   0

ans(:,:,2) =

0   0
0   0

Well, every column has 2 elements and the matrix has 8 elements total so I would say it has 8 / 2 = 4 columns (we could also say it has 2 pages and each page has 2 columns so there are 2 * 2 = 4 columns).

But although everyone agrees on the definition of a column (even in the three-dimensional setting: a single column is of the form A(:,i,j) for some i and j), apparently the definition of the word “has” is overloaded when talking about matrix dimensions.  In Octave terminology, the above matrix has only 2 columns, which means “the length of the row dimension is 2”.  Equivalently, “A has x rows” translates to “the length of the column dimension is x”.

I don’t know if this terminology generalizes.


by dnspies at May 23, 2014 08:00 PM

Eugenio Gianniti

ufl binding

This week I started my work on the ufl function: it is now possible to write UFL code on the go, directly in your m-files. You can see below how the Poisson.ufl file of the homonymous example provided with fem-fenics can be translated into a snippet of Octave code:

# Copyright (C) 2005-2009 Anders Logg

element = FiniteElement("Lagrange", triangle, 1)

u = TrialFunction(element)

v = TestFunction(element)
f = Coefficient(element)
g = Coefficient(element)

a = inner(grad(u), grad(v))*dx

L = f*v*dx + g*v*ds

The equivalent Octave snippet:

# Copyright (C) 2005-2009 Anders Logg
ufl start Poisson
ufl element = FiniteElement("Lagrange", triangle, 1)
ufl u = TrialFunction(element)
ufl v = TestFunction(element)
ufl f = Coefficient(element)
ufl g = Coefficient(element)
ufl a = inner(grad(u), grad(v))*dx
ufl L = f*v*dx + g*v*ds
ufl end

How to use ufl

Basically, you just need to prepend what you would have written in your .ufl file with ufl. There are, however, two new instructions. fem-fenics still needs to store your code in a separate file, which is then compiled using ffc, the FEniCS form compiler, but now ufl takes care of the whole process.

Your code should begin with the start command, optionally followed by the name you want to assign to the file: in this example, we choose to open a new Poisson.ufl file. Be aware that ufl will not overwrite an existing file, so if you plan to use your script for several runs, my suggestion is to keep your working directory clean and tidy with a delete ('Poisson.ufl') after the snippet above.

When you are done with your ufl code, the end command will tell ufl that it can compile and provide you with your freshly built problem. You can also specify options like BilinearForm (it is not the only one available; find a comprehensive list in the help message in Octave), in case you wrote just part of the problem in your last lines.

What now?

A lot of effort was devoted to this function. This is not due to intrinsic difficulties: a sketch of the function's code has been around for a while, and the current implementation has not strayed far from it. The goal was to obtain a robust piece of code, since it will be the cornerstone of a new paradigm in fem-fenics usage. At the very least, each and every example provided with the package needs to be modified to take advantage of this change, and this will be my next task.

by Eugenio Gianniti at May 23, 2014 02:05 PM

May 19, 2014

Eugenio Gianniti

My first function - Follow up

As mentioned in my previous post, I have been working on extending the implementation of interpolate to allow for an Expression as input. It can now also be used as in the Python dolfin interface, see here. Let's see how to use this new function in fem-fenics.

The Poisson equation

This example can be found in the FEniCS Book; it is the very first one. The problem at hand is the Poisson equation with Dirichlet boundary conditions:
- Δu = f in Ω
u = u0 on ∂Ω
We will solve this problem on the unit square, with f constant and equal to -6 and u0 = 1 + x² + 2y². It can be verified that the exact solution is u_ex = 1 + x² + 2y². With the following ufl file:
element = FiniteElement("Lagrange", triangle, 1)

u = TrialFunction(element)
v = TestFunction(element)
f = Coefficient(element)

a = inner(grad(u), grad(v))*dx
L = f*v*dx

and Octave script:

pkg load fem-fenics msh
import_ufl_Problem ('Poisson')

# Create mesh and define function space
x = y = linspace (0, 1, 20);
mesh = Mesh(msh2m_structured_mesh (x, y, 1, 1:4));

V = FunctionSpace('Poisson', mesh);

func = @(x,y) 1.0 + x^2 + 2*y^2;

# Define boundary condition
bc = DirichletBC(V, func, 1:4);

f = Constant ('f', -6.0);

# Define exact solution
u_e = Expression ('u_ex', func);

a = BilinearForm ('Poisson', V, V);
L = LinearForm ('Poisson', V, f);

# Compute solution
[A, b] = assemble_system (a, L, bc);
sol = A \ b;
u = Function ('u', V, sol);

# Save solution
save (u, 'poisson');

# Interpolate and save the exact solution
u_e_int = interpolate (u_e, V);
save (u_e_int, 'exact');

it is possible to compute the numerical solution, interpolate the analytical one on the same function space and then compare them. Using a visualisation tool like Paraview, one can verify that the computed solution and the interpolation of the exact one are practically the same. This is due to the fact that the Finite Element Method with triangle elements on a rectangular domain can exactly represent a second order polynomial, such as the solution of the problem at hand.

Here you can see a good solution to the Poisson problem solved in the example, poorly post-processed in Paraview.

by Eugenio Gianniti at May 19, 2014 07:27 PM

Eduardo Fernández

The starting line.

As code period is starting today, I want to write a brief timeline for the first period of the GSOC here:

      • 19 May-20 June: Implement ilu related functions (,,  and merge them together with ilu.m script 
      • 20-25 June: Automated test writing and documentation. Integration to mainstream octave code should be achieved here. 
      • 27 June: (Milestone 1) ilu.m is fully functional and integrated with Octave core.

      Taking the idea from Kai's last year blog, I will keep track of what is already done with the following figure.

      Regarding repository setup, Kai helped me to configure a subrepository using bitbucket service. At present, it only contains an outdated Octave development version just to make sure things work.  For cloning:

      hg clone

      However, I would not need to use this subrepo until the final integration of my code into Octave. For development purposes I have set another repository for daily work, as I am working with .oct files that compile standalone. Here is the repo you should check for my updated work.

      hg clone

      See you next week!

      by Eduardo at May 19, 2014 10:22 AM

      May 17, 2014

      Eugenio Gianniti

      My first function

      Lately I have coded my first function for fem-fenics: it is interpolate, which wraps the homonymous method of the dolfin::Function class. It makes it possible to interpolate a Function G on the FunctionSpace of a Function F, even if they are not defined on the same mesh, with a call like this:
      res = interpolate (F, G)

      I am working on extending it to allow for an Expression as input. With this function it is possible to make a quantitative comparison between the results of different discretisation approaches or to check the accuracy of a method, comparing the computed solution and an analytically obtained one.

      The implementation

      I provide here a comprehensive overview of the code for interpolate. First of all, the number of input arguments is obtained and an octave_value is declared, in order to hold the output. Then there is a check ensuring that exactly two arguments are provided and no more than one output value is asked for.

      After verifying these preliminary conditions, there are some instructions checking that the function type is loaded and, if necessary, registering it. This way Octave is aware of it and can store it in an octave_value.

      Eventually, the real computation is performed. After checking that the inputs are of the function type, with a static_cast the actual objects are extracted from the arguments:

      const function & u0 = static_cast<const function&> (args(0).get_rep ());
      const function & u1 = static_cast<const function&> (args(1).get_rep ());

      Here comes the tricky part. The classes in fem-fenics are designed to hold constant dolfin objects, but dolfin::Function::interpolate is not a constant method. In order to be able to call it, a local dolfin::Function is constructed, used to perform the interpolation, then fed to the function constructor and assigned to the return value:

      boost::shared_ptr<dolfin::Function> output (new dolfin::Function (u0.get_fun ()));
      const dolfin::Function & input = u1.get_fun ();

      output->interpolate (input);
      std::string name = u1.get_str ();

      retval = new function (name, output);

      by Eugenio Gianniti at May 17, 2014 05:22 PM

      May 05, 2014


      Octave in ESA Summer of Code

      Octave has been selected as a mentor organization for the European Space Agency’s Summer of Code in Space!

      This is our third year in SOCIS. See Roberto Porcù’s blog for the work from last year.

      If you are an eligible student and interested in applying, check out our Ideas page. May 15 is the student application deadline.

      by Nir at May 05, 2014 12:16 AM

      April 27, 2014

      Eugenio Gianniti

      My project

      In this post I identify my project's goals, as already published on Melange.

      list of tasks

      1. reduce copies when passing matrices from Octave to dolfin;
      2. avoid useless compilations of .oct files when they are not needed;
      3. avoid using separate .ufl files, introducing the possibility to write UFL code in .m files;
      4. implement in fem-fenics further FEniCS functionalities, preferably according to the FEniCS developers' directions;
      5. improve the build and distribution system, so that end users can enjoy full functionality right away after installing from Forge.


      I will address point 3 by implementing an .m function which accepts strings as arguments and writes them to a file. There will be two keywords, such as start and end, to identify where the UFL input begins and ends. After writing this code, it will be compiled when needed. This way UFL instructions can be written directly in .m files in the following manner:

      ufl start filename
      ufl <first line>
      ufl <second line>
      ufl <...>
      ufl <last line>
      ufl end

      To address point 5, instead, I will add instructions to PKG_ADD to automatically find out, through pkg-config, the proper flags to allow for the just-in-time compilation of .oct files. I will also add instructions to PKG_DEL to restore the environment to its previous state when the package is unloaded. This would allow end users to use the package without taking care of the problems reported here.
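As a rough sketch of this mechanism (not the actual fem-fenics PKG_ADD code), the flags can be obtained by querying pkg-config. Since a real DOLFIN installation may not be at hand, a minimal hand-written dolfin.pc stands in below for the metadata a real install would provide; the paths and version number are made up for illustration:

```shell
# Sketch of the PKG_ADD idea: ask pkg-config for DOLFIN's flags and
# export them so mkoctfile can later JIT-compile the .oct files.
# A minimal hand-written dolfin.pc stands in for a real installation.
demo_dir=$(mktemp -d)
cat > "$demo_dir/dolfin.pc" <<'EOF'
prefix=/opt/dolfin
Name: dolfin
Description: stand-in for a real DOLFIN pkg-config file
Version: 1.3.0
Cflags: -I${prefix}/include
Libs: -L${prefix}/lib -ldolfin
EOF
export PKG_CONFIG_PATH="$demo_dir"

# What PKG_ADD would do (and PKG_DEL would undo on package unload):
export CPPFLAGS="$(pkg-config --cflags dolfin)"
export LDFLAGS="$(pkg-config --libs dolfin)"
echo "$CPPFLAGS"   # -I/opt/dolfin/include
```

A real PKG_ADD would additionally save the previous values of these variables, so that PKG_DEL can restore the environment exactly as it was.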

      tentative agenda

      • 22 April - 18 May: Study of the documentation and interaction with my mentor and the Octave community to understand thoroughly the code base and the contribution expected from my project
      • 19 May - 22 June: First phase of the project. I will implement bindings for the UFL language in Octave and adapt accordingly the provided examples. I will also work on the build and distribution system to allow for feedback on it from the community. In the end, I will commit a first set of new functions
      • 23 June - 27 June: Period for the submission of the mid-term review, I will double check the functionalities already implemented and improve their documentation
      • 28 June - 10 August: Second phase of the project. I will improve the package performance, both reducing copies of matrices between Octave and dolfin and implementing checks to avoid useless just in time compilations. Furthermore, I will add a second, probably larger, set of new functions, as suggested by the community and FEniCS developers. I expect to code some new examples which make use of the freshly introduced capabilities
      • 11 August - 22 August: Week devoted to final minor fixes and to the improvement of the documentation

      by Eugenio Gianniti at April 27, 2014 08:45 PM

      Introducing my project

      My name is Eugenio, a student in Mathematical Engineering at Politecnico di Milano. This summer I will be working with GNU Octave to continue the implementation of fem-fenics, an Octave Forge package started during last year's Google Summer of Code. It is intended as a wrapper of FEniCS, a general purpose finite element library, and its goal is to provide such numerical methods in the familiar interface offered by Octave.

      In this blog you will find up-to-date information about the state of my contribution.

      by Eugenio Gianniti at April 27, 2014 08:14 PM

      April 24, 2014

      Eduardo Fernández

      Introducing myself

      My name is Eduardo (edu159), and this blog has the purpose of tracking the state of my project with Octave during the GSOC 2014 program (if I am selected).

      You can visit my public profile at the Octave wiki here:

      Feedback would be welcome. Feel free to comment :).


      by Eduardo at April 24, 2014 12:12 AM

      April 22, 2014

      Eduardo Fernández

      GSOC acceptance.

      I am happy to announce that I have been selected for the 2014 GSOC program. In a few days I will continue from where I left off my project.

      Thanks to all!

      by Eduardo at April 22, 2014 01:15 PM

      April 18, 2014

      Wendy Liu

      Showing only additions and deletions in git

      If you've ever been in a situation where you wanted to see only the additions, or only the deletions, in a git diff, then you probably know that it's not as simple as running git diff --diff-filter=A or git diff --diff-filter=D. At least, not if you're looking for added or deleted lines, as opposed to files.

      From the man page for git diff:

      Select only files that are
      * A Added
      * C Copied
      * D Deleted
      * M Modified
      * [omitted]

      That's pretty cool, but this only works on files. So this would let us see only new files, or only deleted files. But what about lines? What if you only want to see the lines that have been added (or deleted) in one file?

      The solution

      Unfortunately, git does not appear to have an option for this built-in. On the bright side, if you run a *NIX system, you can do this:

      git diff | grep ^+

      This gets you all the additions. To see all the deletions:

      git diff | grep ^-

      Note that the output of the first command will also give you something like

      +++ b/filename

      for each affected file. If this is a problem, you can alter the regex to account for this (or just delete it manually).
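One way to account for it is a second grep that filters out those header lines. The sketch below builds a throwaway repository (the file name and contents are made up) so the pipeline can be shown end to end:

```shell
# Throwaway repository just to demonstrate the pipeline (assumes git
# is installed; identity is passed with -c so no global config is needed).
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'one\ntwo\n' > file.txt
git add file.txt
git -c -c user.email=demo@example.invalid commit -q -m initial
printf 'one\nthree\n' > file.txt

# Added lines only, with the "+++ b/file.txt" header filtered out:
git diff | grep '^+' | grep -v '^+++'   # prints "+three"
```

The same trick works for deletions (grep '^-' | grep -v '^---'), and piping the result through cut -c2- strips the leading + or - without resorting to an editor.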

      You can of course pass other parameters to git diff, such as the specific commit or commit range you want to see the diff for, or the filename(s).

      The use case

      So why would anyone ever need to do something like this?

      If you've ever been to a Model United Nations conference, you probably know that in exchange for paying a nominal registration fee and giving up your weekend to try to convince someone (who is not actually a representative of the United States) that you (someone who is not actually a representative of Iran) have absolutely no intention of producing nuclear weapons, definitely not, let's go out for a drink and I'm sure we can clear up this whole misunderstanding, you get a nice shiny badge that says your name, your school, and the country that you are supposed to pretend to represent, which looks something like this:

      expanding acronyms is hard

      For the last year or so, I was part of the organising committee for a Model United Nations conference that is sadly not named BADPUNS. As the USG-IT, one of my responsibilities was to take the registration information that was provided by conference attendees through the website and send it to the person in charge of printing badges. Since the website was completely custom-built and I didn't feel like writing a ton of code to make this into an actual feature (since, theoretically, it would only have to be used once), I just wrote a simple Django management command for exporting the information for the badges as a CSV file.

      It worked fine, except for the fact that it was a week before the conference and some of the attendees still had not filled in their names. But we needed to get at least some of the ~1400 badges printed now, so I sent off what I had, and those badges got printed.

      Two days later, when I exported the new badge list, I used git to figure out the difference between the current badges.csv and the previous one. The difficulty arose from the fact that not only did we have new badges to print (because some attendees didn't fill in their information before), we had badges to get rid of as well (because some attendees had last-minute cancellations). So I needed to use the exported badges.csv file to generate a new CSV containing just the new badges to be printed, as well as one listing the badges to be discarded.

      That's where git and grep came to the rescue. To get rid of the + and - at the beginning of each line, I just used Vim's visual block mode. Much easier than manually checking each name, which is probably what I would have done if I didn't have git in my life.

      There may actually be a way of doing this with just git, but I haven't found it. This might be a pretty hacky solution, but hey, it works.

      Know of a better solution? Do tell! I'm @dellsystem on Twitter.

      by dellsystem (Wendy Liu) at April 18, 2014 04:00 AM

      April 16, 2014

      Jordi Gutiérrez Hermoso

      Python is an excellent lingua franca

      I just spent 5 days at PyCon 2014 here in Montréal (3 days for the actual conference, 2 days sprinting), and wow, what a great conference that was.

      There are many things I want to praise about the whole experience. The venue was great, the organisation was superb, the talks were interesting, the infrastructure was amazing, the atmosphere was friendly… but most of all, I think I want to praise the entire culture of inclusiveness that the Python community is trying to promote.

      It is interesting that the only true common thread at the conference was a programming language (and not even that, sometimes; some of the talks were hardly about the Python programming language at all). Python was originally conceived as a programming language that was meant to be as easy as possible to understand. Whether it has succeeded from a purely language-design point of view is hard to say, and not everything about Python is great. The language has its gotchas here and there, just like any other language. And yet, despite not being a perfect programming language, it’s able to bring together such a diverse group of individuals to accomplish common goals.

      Python is an excellent programming lingua franca for everyone, not just for Unix nerds (witness: Windows support is taken seriously) and not just for programming geeks (witness: Software Carpentry). Just take a look at the wide range of topics covered in the talks. General software development, the benefits of software freedom, cryptography and security (lol, heartbleed)…

      Of particular note is that 1/3 of all attendees and speakers were female. Can any other tech conference boast such inclusiveness of the usually tech-neglected half of humankind? Look at all the talks related to gender issues: sexism in rap lyrics via machine learning, being a transgender Python hacker, or how to bring more Python to girls in school.

      Now, to be clear, I don’t think that PyCon has eliminated sexism or that we have “won” this battle. As I overheard someone say, PyCon will not be inclusive enough for women unless the lines for the women’s bathroom are as long as the lines for the men’s. And there are still many issues, such as women still being “invisible” and ignored, or as I overheard someone else say, she actually had to say to some guy to look up from her breasts while she was talking to him. It’s not there all the way yet.

      This just seems like a good start. I hope next year at PyCon 2015, we’ll be able to get 50% women attendees and speakers!

      by Jordi at April 16, 2014 02:14 PM

      March 25, 2014

      Eduardo Fernández

      Project goals

      My intention is to upgrade some functions related to sparse matrices so that they become Matlab-compliant, and to implement others that are not present in Octave right now. This is the list, plus some comments about how I expect to do things.

      1. Implement sprand/sprandn with the 4th parameter. (link) (Patch accepted)
      2. sprandsym (implement arguments needed to be compliant with Matlab)
      3. lsqr 
      4. minres
      5. ilu (complete last year GSOC project)
      6. ichol (complete last year GSOC project)
       1: I have already submitted a patch, but maybe some modifications are needed to improve performance. Anyway, I have mailed Rik (the last one to modify the source code of those functions) and he told me that he will give it a look in a couple of weeks.

      2: I have not yet investigated enough to give a road map for implementing that function.

      3 & 4: For implementing these functions I found those links:

      On this website there are codes written in several programming languages that implement the algorithms (the authors are the ones who wrote the paper that Matlab gives as a reference in its documentation for both functions). I have emailed professor Michael Saunders about adapting them into Octave versions, and he answered that I am welcome to do so as long as I respect the license (CPL or BSD licenses). They are meant to be very unrestrictive, but I need to find out about their compatibility with GPLv3.

      5: Here comes the big one. That function has a big set of options, and last year it was almost fully implemented by Kai Torben as his GSOC project. He interfaced Octave with the ITSOL/ZITSOL libraries, but in the end there were some issues with that approach:

      • The ILUTP algorithm did not work
      • He had to patch the library to get things working!
      • Modified versions of the algorithms (the "milu" option) were not implemented in the libraries
      • That "ugly" scenario led to ITSOL finally not being included as a dependency of Octave. Bottom line, the integration with the development repository could not be achieved.
      My approach: I will write from scratch all the needed functions as oct-files (ILUTP, ILU0, ILUC and ILUT), for real and complex numbers, implementing the algorithms described by Yousef Saad in his book "Iterative Methods for Sparse Linear Systems, 2nd ed.". I can reuse some of the code that Kai wrote, mostly the tests and the m-file ilu.m that glues all the functions together.

      6: That function needs less work, since it is almost fully implemented. Just the complex implementations and the modified versions of the algorithms are missing. Since Kai implemented those functions from scratch, there are no dependency problems. That is nice.

      Update 1: Kai has pointed out to me that there are some license issues with the FORTRAN code he used (see here). So the ichol-related functions need to be implemented from scratch.

      Update 2: Following Kai's recommendations, I will focus on the ilu/ichol functions for the GSoC period and, just in case there is some time left, go ahead with the other functions.

      by Eduardo at March 25, 2014 11:56 AM

      March 18, 2014

      Eduardo Fernández

      ILU0 implementation.


      I have implemented the most basic function, ilu0, which performs the incomplete LU factorisation with 0-fill. I have drawn a table with the execution times of that algorithm using Matlab, my version, and the code that Kai implemented last year using ITSOL. The function right now can be used with real and complex numbers and has the milu='row' option implemented.

      The table shows, for an NxN sparse matrix with NNZ non-zero elements, the execution time of each implementation (tic - toc was used).

      N - NNZ          Matlab       my version   ITSOL
      50 - 683         0.000055 s   0.000065 s   0.00085 s
      400 - 72435      0.027 s      0.024 s      0.04 s
      2000 - 1805571   3.35 s       3.25 s       4.88 s
      5000 - 6482839   14.2 s       14.5 s       22.75 s

      It can be seen that the implementation using ITSOL is the slowest, probably just because of the overhead of copying and translating data back and forth between Octave and ITSOL. Between my version and Matlab's there is almost no difference.

      Here you can download the code. It does not have any tests nor documentation written yet.

      by Eduardo at March 18, 2014 04:13 AM

      March 06, 2014


      Apply to work on Octave for GSoC!

      Our Google Summer of Code project ideas got some promotion at Universiteit Gent (Belgium) this week.

      Student applications to Google open on the 10th. The 21st is the application deadline. If you’re interested in applying to work on Octave, see here first.

      by Nir at March 06, 2014 09:04 PM

      February 26, 2014


      GSoC mentoring

      Octave has been accepted into Google Summer of Code, for our first time as an independent organization! Student applications are due March 21, and decisions about accepting students will be made in early April.

      To make GSoC a success, we need not only strong student programmers but also committed mentors to supervise the students and work with them to have contributions be useful for Octave. This can be tough to achieve because mentors are unpaid (though Google may send you a nifty T-shirt).

      Each project should have a primary mentor and a backup mentor. In my experience, primary mentors should plan to devote 5-10 hours per week to the project, including speaking with the students and reviewing their code. Backup mentors have a smaller time commitment but still should keep up with project progress and be available to step in when the primary mentor is otherwise occupied.

      If you’d like to be a project mentor:

      1) Go to the wiki projects page and put your name next to projects you’d be interested in mentoring. Feel free to improve the project description at the same time.

      2) Sign up as a mentor for Octave on the GSoC Melange site.

      3) Start interacting with student applicants who have similar topics of interest.

      by Nir at February 26, 2014 07:36 PM

      February 01, 2014

      Jordi Gutiérrez Hermoso

      Poutine bought with bitcoins

      I am cautiously hopeful for bitcoin. I just ate pizza and poutine with a group of friends, paying at the restaurant with bitcoins!

      Acquiring the bitcoins

      I am not a speculator. I am not an investor. I am not a miner. I am not trying to get rich nor make any money whatsoever by manipulating bitcoins. I just want to be able to earn and spend money on the internet without needing the permission of a bank, or the permission of Paypal, or making it anyone’s business but mine and the person I’m actually doing business with.

      I acquired some bitcoins about a year ago, at the time it was about 1.8 bitcoins worth around 30 USD. I did not buy them. I did not invest on bitcoins. I did not mine them. I received them as a payment for tutoring mathematics in an IRC channel for an hour and a half. I earned these bitcoins through my labour, just like I would earn any other currency.

      At the time there was not much I could do with bitcoins. Although I freely encouraged and accepted the payment in bitcoins, I did it more with amusement than conviction. I am not a criminal, but since at the time Silk Road still had a sort of forbidden underground allure, my client suggested that I could use the bitcoins to buy illegal drugs on the black market. I had no intention to do so, but I decided to keep the bitcoins around, in the cautious hope that I could use them for something else.

      First purchases

      I kept the bitcoins for quite some time, since I could not find anything to spend them on. The first thing I could find that seemed interesting was a 4chan “account” (in reality, just an exemption from its captcha). Despite the website’s bad reputation as being the cesspool of the internet, I think that there is value in its pro-anonymity ethos, something that is slowly being eroded away in today’s online world. This was also my first test case for the possibility of anonymous online currency. In keeping with this ethos, I made the payment, and when there was a slight hiccup with the transaction, used a throw-away email address to resolve the issue. No chargebacks, but people are still people and are still mostly nice. I thus obtained my 4chan captcha exemption. I have not really used it since then, but I am satisfied knowing that I have made a small contribution towards promoting anonymity.

      I kept my eye out for news stories of more opportunities to spend bitcoins on. The currency seemed to be slowly gaining more adoption, especially for non-material goods and services. My next purchase was one month of reddit gold as a gift for a particularly witty commentator. These two purchases together had already given a significant blow to my bitcoin balance, but I was not duly concerned. After all, this was just pocket change I acquired for 90 minutes of a hobby task.

      Then, suddenly, over a couple of weeks the bitcoin price exploded from 20 USD per bitcoin to over 1000 USD per bitcoin. I didn’t exactly become a millionaire, but my paltry fraction of a bitcoin now had the power to buy more things.

      Bitcoin starting to look like a possibility

      I made a few more purchases. A PDF copy of Julian Assange et al.'s Cypherpunks book. A month of VPN access. Sent bitcoins to a kind stranger on the internet in exchange for digital albums from indie band Nectarphonic. When started selling gift cards to in exchange for bitcoins, I obtained my first physical product for bitcoins: a boxed set of Suzanne Collins's The Underland Chronicles.

      This really had just been my time getting used to bitcoin and how it works. The fluctuations in price, the cryptography behind it, the UI presented first by the “official” bitcoinqt client and understanding more how the alternative bitcoin client Electrum works. The security model, what it means to “own” bitcoins. Passwords and private keys. The workings of the blockchain. Using Tor for enhanced anonymity of transactions.

      I finally got around to reading Satoshi Nakamoto’s whitepaper. Wow. I don’t know if this person or group is a genius or just a child of the times, but Nakamoto seems to have solved a cryptography protocol problem that nobody had solved before. Nakamoto didn’t invent the idea of cryptocurrencies, but merely built upon the ideas of others in order to build a decentralised system. I was duly impressed.

      Then two days ago I saw that the first restaurant in Montréal was proclaiming to accept bitcoins. I could now buy Québec’s signature dish, poutine, with bitcoins! Excited, I started making plans.

      Food for bitcoins

      I eagerly announced yesterday on IRC that I was planning to go to this restaurant to try out my bitcoins. Another chatter local to Montréal decided to tag along for fun. We made plans to go that very evening for a night of socialising, three couples. I eagerly explained to everyone my excitement over finally being able to pay someone face-to-face with bitcoins.

      My fiancée and I got to Montréal Poutine in the Old Port a few minutes earlier than everyone else. It was a small location, hardly more than a greasy spoon, but par for the course for a poutine joint. There were signs all over the establishment announcing the possibility of paying with bitcoins and a wifi password on their chalkboards.

      We eyed the menu, and I nervously made a few preliminary checks to ensure the transaction would go smoothly. Due to one of my many quirks, I do not own a pocket computer (“smartphones”, neither smart and hardly phones), so I was prepared to pay with my laptop and a webcam for scanning a QR code. I ensured that the internet connection was stable and that I could scan a QR code. As instructed in (or in, because this is .ca), I proudly announced to my server that I was going to pay with bitcoins. “Oh, wow, it’s my first time,” she said in French. “Mine too,” I replied with a smile.

      Our company arrived, the two other couples. We ordered various kinds of poutine, pizza, avocado fries, beer and mineral water. We chatted about many things, and soon predictable discussions about bitcoins ensued. Their method of operation, their security, their volatility but recent relative stability. Musings if they would ever work or not. The more techy in the bunch were showing signs of optimism, while those who found the idea more foreign were predicting the eventual doom for bitcoins.

      After a very pleasant evening, it was time to pay the bill.

      The moment of truth

      I announced that I had enough bitcoins to pay for all the food, but I asked everyone else to divvy up the drinks between them. I also told them that I had read that tips were not accepted in bitcoins yet, so that would have to be paid elsehow.

      I readied my laptop and the webcam, to the growing amusement of my companions. The server came with one bill for food and another for drinks. I reminded her that I was going to pay with bitcoins, but I wasn’t quite sure what to expect. I had seen videos online of handheld machines that displayed a QR code with the bitcoin address to pay to, and that was my guess for what I would see. Instead, she pointed me to a static QR code pasted next to the cash register. Was this the bitcoin address to pay to? I walked with my baroque laptop-and-webcam rig over to the cash register, in good fun as I heard my giggling friends behind me.

      I used Electrum to scan the QR code, looking for the address. Instead of a bitcoin address, this was a Bitpay URL. I wasn’t quite sure what to do with it, and I floundered for a few seconds. I opened a generic QR code scanner to get the URL. One of the guys with us helpfully walked over to help me scan the QR code, but I had already managed to get the URL loaded in Firefox by this time. The server was re-reading the instructions next to the QR code on how to pay.

      At the Bitpay URL, there were two fields to fill out: an order number and the amount in CAD to pay. The server read the instructions again and said to leave the order number blank. I filled in the amount with taxes, showed it to her, and we agreed it was correct. I clicked enter. A bitcoin address showed up. I went back to Electrum to send money to that address. I stumbled when typing my wallet password. The server, getting the hang of the transaction, said bemusedly, “I cannot help you with that.” Finally I got it, and the familiar “this invoice has been paid” message showed up on the Bitpay website.

      A few seconds passed, and the restaurant staff confirmed on another computer that they had received the payment. I showed everyone at the table the paid invoice on my laptop screen. A few other customers in the restaurant were showing some interest. I announced to everyone, “it’s ok, bitcoin has worked, I made the payment!”

      Everyone relaxed and proceeded to tackle the more familiar problem of paying for the drinks with more conventional currency.

      Future of bitcoins

      I don’t know if bitcoin is going to crash or not. I don’t understand enough economics to comment on what the value of bitcoins is, or if it really is different from the value that we give to any other currency. I have a rough understanding of the cryptography behind it and what makes the whole thing work. I know people are working on making bitcoins more accessible to people less enthusiastic than me, and I very much wish they succeed.

      All I know is that so far bitcoins have worked for me. I look forward to getting better acquainted with this new way to conduct business. I wish good fortune to everyone else who is also trying to build a new currency for an internet-powered age.

      by Jordi at February 01, 2014 09:51 PM

      January 08, 2014

      Jordi Gutiérrez Hermoso

      Polling for OctConf 2014

      So, we want to know if this is going to work out.

      OctConf 2014
      Can you come to Montréal, Canada, in the summer of 2014 for OctConf?

      by Jordi at January 08, 2014 02:59 PM

      January 05, 2014

      Roberto Porcù

      December 15, 2013

      Roberto Porcù

      RATTLE algorithm


Constrained mechanical systems form an important class of differential equations on manifolds. For all the theory presented here and used in the implementation, I refer to "Geometric Numerical Integration" by Hairer, Lubich and Wanner.

Consider a mechanical system described by position coordinates \(q_1,\dots q_d\) and suppose that the motion is constrained to satisfy \(g(\mathbf q) = \mathbf 0\), where \(g:\mathbb R^d \rightarrow \mathbb R^m \) with \(m \lt d \). The equations of motion governing the system then become: $$ \left\{ \begin{array}{l} \dot{\mathbf q} = \frac{\partial H}{\partial \mathbf p} \\ \dot{\mathbf p} = -\frac{\partial H}{\partial \mathbf q} - G(\mathbf q)^T\lambda \\ g(\mathbf q) = \mathbf 0 \end{array}\right. $$ where \(G(\mathbf q) = \dfrac{\partial g(\mathbf q)}{\partial \mathbf q} \).

The symplectic Euler method can be extended to constrained systems, but we focus on the SHAKE and RATTLE algorithms. SHAKE is a two-step algorithm; since I'm implementing only one-step algorithms, and the overall structure of solvers and integrators is designed for one-step solvers, I implemented just the RATTLE algorithm.


The RATTLE algorithm implemented works with any general Hamiltonian \(H(\mathbf q,\mathbf p)\) and is defined as follows: $$ \left\{\begin{array}{l} \mathbf p_{n+\frac{1}{2}} = \mathbf p_n -\frac{h}{2}\big(H_{\mathbf q}(\mathbf q_n,\mathbf p_{n+\frac{1}{2}}) + G(\mathbf q_n)^T \mathbf {\lambda}_n \big) \\ \mathbf q_{n+1} = \mathbf q_n +\frac{h}{2} \big( H_{\mathbf p}(\mathbf q_n,\mathbf p_{n+\frac{1}{2}}) + H_{\mathbf p}(\mathbf q_{n+1},\mathbf p_{n+\frac{1}{2}}) \big) \\ g(\mathbf q_{n+1}) = \mathbf 0 \\ \mathbf p_{n+1} = \mathbf p_{n+\frac{1}{2}} -\frac{h}{2}\big(H_{\mathbf q}(\mathbf q_{n+1},\mathbf p_{n+\frac{1}{2}}) + G(\mathbf q_{n+1})^T \mathbf{\mu}_n \big) \\ G(\mathbf q_{n+1}) H_{\mathbf p}(\mathbf q_{n+1},\mathbf p_{n+1}) = \mathbf 0 \end{array}\right. $$ where \( h=\Delta t=t_{k+1}-t_k\), and \(\mathbf{\lambda}_n\) and \(\mathbf{\mu}_n\) are Lagrange multipliers needed to impose the constraints.

      It can be demonstrated that this numerical method is symmetric, symplectic and convergent of order 2.

The following code is its implementation:

      function [t_next,x_next,err]=rattle(f,t,x,dt,options)
      H = odeget(options,'HamiltonianHessFcn',[],'fast');
      GG = odeget(options,'ConstraintHessFcn',[],'fast');
if( ~isempty(H) && ~isempty(GG) )
fsolve_opts = optimset('Jacobian','on');
else
fsolve_opts = optimset('Jacobian','off');
endif
      g = odeget(options,'ConstraintFcn',[],'fast');
      G = odeget(options,'ConstraintGradFcn',[],'fast');
      c_nb = odeget(options,'ConstraintsNb',[],'fast');

      dim = length(x)/2;
      q0 = x(1:dim);
      p0 = x(dim+1:end);

      RATTLE = @(y)constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt);
      y0 = [q0;p0;p0;zeros(2*c_nb,1)];
      y0 = fsolve(RATTLE,y0,fsolve_opts);
      t_next = t+dt;
      x_next = [y0(1:dim);y0(2*dim+1:3*dim)];

      dt = dt/2;
      RATTLE = @(y)constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt);
      y0 = [q0;p0;p0;zeros(2*c_nb,1)];
      y0 = fsolve(RATTLE,y0,fsolve_opts);

      q0 = y0(1:dim);
      p0 = y0(2*dim+1:3*dim);
      t = t+dt;

      RATTLE = @(y)constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt);
      y0 = [q0;p0;p0;zeros(2*c_nb,1)];
      y0 = fsolve(RATTLE,y0,fsolve_opts);

      x_est = [y0(1:dim);y0(2*dim+1:3*dim)];
      err = norm(x_est-x_next,2);

      function [F,J] = constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt)
      F = zeros(3*dim+2*c_nb,1);
      F(1:dim) = y(1:dim) - q0 - (dt/2).*(f(t,[q0; ...
      y(dim+1:2*dim)])(1:dim) + f(t+dt,y(1:2*dim))(1:dim));
      F(dim+1:2*dim) = y(dim+1:2*dim) - p0 - (dt/2).*(f(t,[q0; ...
      y(dim+1:2*dim)])(dim+1:end) - G(q0)'*y(3*dim+1:3*dim+c_nb));
      F(2*dim+1:3*dim) = y(2*dim+1:3*dim) - y(dim+1:2*dim) - ...
      (dt/2)*(f(t+dt,y(1:2*dim))(dim+1:end) - ...
      F(3*dim+1:3*dim+c_nb) = g(y(1:dim));
      F(3*dim+c_nb+1:end) = G(y(1:dim))*(f(t+dt,[y(1:dim); ...

      if( nargout==2 )
      J = zeros(3*dim+2*c_nb,3*dim+2*c_nb);
      J(1:dim,1:dim) = eye(dim) - ...
      J(1:dim,dim+1:2*dim) = -(dt/2)*(H(t,[q0; ...
      y(dim+1:2*dim)])(dim+1:end,dim+1:end) + ...
      J(dim+1:2*dim,dim+1:2*dim) = eye(dim) + ...
      J(dim+1:2*dim,3*dim+1:3*dim+c_nb) = (dt/2)*G(q0)';
      J(2*dim+1:3*dim,1:dim) = (dt/2)*(H(t+dt, ...
      for k = 1:1:c_nb
      J(2*dim+1:3*dim,1:dim) = J(2*dim+1:3*dim,1:dim) - ...
      J(2*dim+1:3*dim,dim+1:2*dim) = -eye(dim) + ...
      J(2*dim+1:3*dim,2*dim+1:3*dim) = eye(dim) + ...
      J(2*dim+1:3*dim,3*dim+c_nb+1:end) = (dt/2)*G(y(1:dim))';
      J(3*dim+1:3*dim+c_nb,1:dim) = G(y(1:dim));
      J(3*dim+c_nb+1:end,1:dim) = G(y(1:dim))* ...
      for k = 1:1:c_nb
      J(3*dim+c_nb+k,1:dim) = J(3*dim+c_nb+k,1:dim) + ...
      ((GG(y(1:dim))(:,:,k))*(f(t+dt,[y(1:dim); ...
      J(3*dim+c_nb+1:end,2*dim+1:3*dim) = G(y(1:dim))* ...
      (H(t+dt,[y(1:dim);y(2*dim+1:3*dim)]) ...
It works with any number of constraints, provided this number is less than the system dimension. As usual, all the source code is available at my public repository octave-odepkg.
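To see the scheme at work outside Octave, here is a minimal Python sketch of one RATTLE step for the planar pendulum (unit length, mass and gravity; all names and constants are illustrative, not the odepkg code). For this simple Hamiltonian the position-stage multiplier reduces to a scalar quadratic solved in closed form, and the momentum-stage multiplier is linear, so no general nonlinear solver is needed:

```python
# RATTLE sketch for the pendulum: H(q,p) = |p|^2/2 + q_y, with the
# constraint g(q) = (|q|^2 - 1)/2, so G(q) = q^T.
import math

def grad_V(q):
    return (0.0, 1.0)            # V(q) = q_y, unit gravity

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def rattle_step(q0, p0, h):
    gV = grad_V(q0)
    # Position stage: q1 = c - s*q0 with s = (h^2/2)*lambda; imposing
    # |q1|^2 = 1 gives a scalar quadratic for s, solved in closed form.
    c = (q0[0] + h * (p0[0] - 0.5 * h * gV[0]),
         q0[1] + h * (p0[1] - 0.5 * h * gV[1]))
    cq = dot(c, q0)
    s = cq - math.sqrt(cq * cq - (dot(c, c) - 1.0))  # root nearest zero
    q1 = (c[0] - s * q0[0], c[1] - s * q0[1])
    p_half = ((q1[0] - q0[0]) / h, (q1[1] - q0[1]) / h)
    # Momentum stage: mu enforces the hidden constraint q1 . p1 = 0.
    gV1 = grad_V(q1)
    mu = (2.0 / h) * dot(q1, p_half) - dot(q1, gV1)
    p1 = (p_half[0] - 0.5 * h * (gV1[0] + mu * q1[0]),
          p_half[1] - 0.5 * h * (gV1[1] + mu * q1[1]))
    return q1, p1

q, p = (1.0, 0.0), (0.0, 0.0)    # released horizontally, at rest
H0 = 0.5 * dot(p, p) + q[1]
for _ in range(1000):
    q, p = rattle_step(q, p, 0.01)
H = 0.5 * dot(p, p) + q[1]
print(abs(dot(q, q) - 1.0), abs(dot(q, p)), abs(H - H0))
```

After a thousand steps both constraints hold to machine precision, while the energy error stays bounded, as expected from a symmetric, symplectic, order-2 scheme.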

by Roberto Porcù at December 15, 2013 03:47 PM

      December 09, 2013

      Roberto Porcù

      Steppers and Integrate functions

In this first post I want to explain the organization of the code that I'm going to implement. The main idea is to have a structured organization and to subdivide the code according to the most important operations to be executed. That way it should be easier to optimize the bottlenecks and to extend the code with new functionality.

      We will have two main actors: the steppers and the integrate functions.


      A stepper will be a function and will represent the numerical method used for the integration. Its job is to execute just one integration step and its signature will be:

      [x_next,err] = stepper(f,x,t,dt)

      x_next is the solution at the next step, err is an estimation of the error (obtainable for example with Richardson Extrapolation), f is a function handle representing the equations to be integrated, x is the solution at the current time t and finally dt is the time step.

Inside this function I'll check whether err is really requested by means of the nargout keyword. The error estimate will be useful to determine the optimal dt in adaptive integrators.

To set the parameters for both the numerical method and the solver, I'll use odeset and odeget, which are not embedded into Octave but are functions of odepkg. My intent is to edit these functions in order to make them suitable for my parameter setting.

      Integrate Functions

An integrate function will be the function executing the integration algorithm over multiple steps. We will have different integrate functions depending on how we want to obtain the solution. For example we can have:

      • integrate_const(stepper,f,x0,t0,t1,dt) integrating from t0 to t<=t1 with a fixed dt;
      • integrate_n_steps(stepper,f,x0,t0,dt,n) integrating from t0 for n integration steps with fixed dt;
      • integrate_adaptive(stepper,p,f,x0,t0,t1,dt) integrating from t0 to t1 with an adaptive timestep, where p is the order of the stepper.
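To make the interplay concrete, here is a scalar-ODE sketch of a stepper plus a fixed-step integrate function (the names fe_stepper and integrate_const are hypothetical Python analogues of the Octave interfaces described above, not the odepkg code):

```python
# fe_stepper does one forward Euler step and estimates the error with
# Richardson extrapolation (two half steps), as described in the text.
def fe_stepper(f, x, t, dt):
    x_next = x + dt * f(t, x)
    x_half = x + 0.5 * dt * f(t, x)
    x_est = x_half + 0.5 * dt * f(t + 0.5 * dt, x_half)
    return x_next, abs(x_next - x_est)

def integrate_const(stepper, f, x0, t0, t1, dt):
    """Integrate from t0 up to t <= t1 with a fixed time step dt."""
    t, x = t0, x0
    ts, xs = [t], [x]
    while t + dt <= t1 + 1e-13:
        x, _ = stepper(f, x, t, dt)
        t += dt
        ts.append(t)
        xs.append(x)
    return ts, xs

# dx/dt = x with x(0) = 1, whose exact solution at t = 1 is e.
ts, xs = integrate_const(fe_stepper, lambda t, x: x, 1.0, 0.0, 1.0, 1e-3)
print(ts[-1], xs[-1])  # forward Euler carries an O(dt) global error
```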

      A simple example: Forward Euler

In this part I'll show you an example of two simple steppers (fe_richardson and fe_heun). Both steppers make use of the Forward Euler method to find the new solution: $$x_{k+1} = x_{k} + h f(t_{k},x_{k})\ .$$ They differ in the way the error is estimated: fe_richardson makes use of Richardson extrapolation, while fe_heun makes use of the Heun-Euler method.

      This is the implementation of the fe_richardson stepper:

function [t_next,x_next,err] = fwe_richardson(f,t,x,dt,options)
x_next = x + dt.*f(t,x);
t_next = t+dt;
if(nargout == 3)
x1 = x + (.5*dt).*f(t,x);
x_est = x1 + (.5*dt).*f(t+ .5*dt,x1);
err = norm(x_next - x_est,2);
endif
endfunction

The Heun-Euler method combines the Heun method, which is of order 2, with the Euler method, which is of order 1. This combination allows us to obtain an estimate of the local error. We can write the Heun method as $$\begin{aligned} \tilde{x}_{k+1} &= x_{k} + h f(t,x_{k}) \\ x_{k+1}^{*} &= x_{k} + \frac{h}{2} (f(t,x_{k}) + f(t+h,\tilde{x}_{k+1})) \end{aligned}$$ so the error estimate is given by: $$ e_{k+1} = \tilde{x}_{k+1} - x^*_{k+1}=\frac{h}{2}(f(t+h,\tilde{x}_{k+1}) - f(t,x_{k}))$$ This is the implementation of the fe_heun method:

function [t_next,x_next,err]=fwe_heun(f,t,x,dt,options)
x_next = x + dt.*f(t,x);
t_next = t+dt;
if(nargout == 3)
err = .5 * dt * norm(f(t+dt,x_next) - f(t,x),2);
endif
endfunction

      The adaptive integrator

Finally I'll describe how integrate_adaptive is implemented. As I said, this function does the integration over the whole interval [t0,t1], adapting the timestep at each iteration by means of the error estimate returned by the stepper. This function takes as input:

      • stepper: the stepper;
      • p: the order of the stepper;
      • f: the function to be integrated;
      • x0: the initial condition;
      • t0, t1: the extremes of the time interval;
      • dt: the first timestep.
While the current time is less than or equal to the final instant, this function repeatedly calls the stepper and then updates the solution, the time and the next timestep. Finally, since the very last instant may be greater than t1, the function substitutes the last element of the solution with a linear interpolation evaluated at t1 between the last two elements of the solution.

      function [t,x] = integrate_adaptive(stepper,order, ...
      t = tspan(1);
      x = x0(:);
      dt = odeget(options,'InitialStep',[],'fast');
      tau = 1.e-6;
      counter = 2;
while( t(end) < (tspan(end)-1.e-13) )
      [s,y,err] = stepper(func,t(end),x(:,end),dt,options);
      if( s(end) > (tspan(counter)+1.e-13) )
      z = [t(end);s];
      u = [x(:,end),y];
      i = 1;
while( i <= length(z) )
if( abs(z(i)-tspan(counter)) < 1.e-13 )
      x = [x,y];
      t = [t;s];
      i_old = i;
      i = length(z)+1;
      elseif( z(i) > (tspan(counter)+1.e-13) )
      x = [x(:,1:end-1),u(:,1:i-1),u(:,i-1)+ ...
      ((tspan(counter)-z(i-1))/(z(i)-z(i-1)))* ...
      t = [t(1:end-1);z(1:i-1);tspan(counter);z(i:end)];
      i_old = i;
      i = length(z)+1;
      x = [x,y];
      t = [t;s];
      err += eps;
      dt = dt.*(tau/abs(err))^(1.0/(order+1));
      if( counter > max(size(tspan)) )
      k = (length(z)-i_old)+1;
      x = x(:,1:end-k);
      t = t(1:end-k);
      x = x';

The method used to calculate the next timestep is taken from the book by Shampine, Gladwell and Thompson. tau is a parameter which is fixed for now but in the future will be set by means of the odeset and odeget functions.
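In isolation, the step-size update from the loop above looks like this (a Python sketch; 2.2e-16 stands in for Octave's eps, which guards against a zero error estimate):

```python
# Given the estimated local error err and the stepper order p, the new
# step dt_new = dt * (tau/|err|)^(1/(p+1)) targets the tolerance tau.
def next_dt(dt, err, order, tau=1e-6):
    err += 2.2e-16   # avoid division by zero, mirroring "err += eps"
    return dt * (tau / abs(err)) ** (1.0 / (order + 1))

print(next_dt(0.01, 1e-6, 1))  # error right at tolerance: step unchanged
print(next_dt(0.01, 1e-4, 1))  # error too large: step shrinks
```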

      How it will be at the end

At the end, functions like ode45 will be just aliases. For example, we define a function called odefwe which does the integration by means of the Forward Euler method, adapting the timestep with Richardson extrapolation. The call to this function will just hide a call to the integrate_adaptive function in this way:

      odefwe(f,x0,t0,t1,dt) = ...
In the same way we'll define a function called ode12 which does the integration adaptively using the Heun-Euler method, so that we will have:
      ode12(f,x0,t0,t1,dt) = integrate_adaptive(fe_heun,1,f,x0,t0,t1,dt)

by Roberto Porcù at December 09, 2013 08:37 AM

      Odeset & odeget


The odeset and odeget functions allow the user to build a valid ODE options structure. They are already available in Octave odepkg, but they are not perfectly compatible with the MATLAB odeset and odeget functions. Furthermore, for geometric integrators like spectral variational integrators, I will need new options in the ODE options structure which are currently not admitted in Octave odepkg.

Therefore I have written my own versions of odeget.m, odeset.m and the new function ode_struct_value_check.m, in order to have full compatibility with MATLAB odeset and odeget (the same behaviour as in their official documentation) and also the possibility to add new field names which I will need for future solvers in this project.

      The new fields are the union of MATLAB ODE options and Octave ODE options, plus my new options (default values are in square brackets):

      '             AbsTol:  scalar or vector, >0, [1.e-6]'
      ' BDF: binary, {on, [off]}'
      ' Events: '
      ' Explicit: binary, {yes, no, []}'
      ' InexactSolver: switch, {inexact_newton, []}'
      ' InitialSlope: vector, []'
      ' InitialStep: scalar, >0, []'
      ' Jacobian: matrix or function_handle, []'
      ' JConstant: binary, {on, [off]}'
      ' JPattern: sparse matrix, []'
      ' Mass: matrix or function_handle, []'
      ' MassConstant: binary, {on, [off]}'
      ' MassSingular: switch, {yes, [maybe], no}'
      'MaxNewtonIterations: scalar, integer, >0, [1.e3]'
      ' MaxOrder: switch, {1, 2, 3, 4, [5]}'
      ' MaxStep: scalar, >0, []'
      ' MStateDependence: switch, {none, [weak], strong}'
      ' MvPattern: sparse matrix, []'
      ' NewtonTol: scalar or vector, >0, []'
      ' NonNegative: vector of integers, []'
      ' NormControl: binary, {on, [off]}'
      ' OutputFcn: function_handle, []'
      ' OutputSave: scalar, integer, >0, []'
      ' OutputSel: scalar or vector, []'
      ' PolynomialDegree: scalar or vector, integer, >0, []'
      ' QuadratureOrder: scalar or vector, integer, >0, []'
      ' Refine: scalar, integer, >0, []'
      ' RelTol: scalar, >0, [1.e-3]'
      ' Stats: binary, {on, [off]}'
      ' TimeStepNumber: scalar, integer, >0, []'
      ' TimeStepSize: scalar, >0, []'
      ' Vectorized: binary, {on, [off]}'

      The usage of odeset will be one of the following:

      opt = odeset()
      opt = odeset(old_ODE_options,new_ODE_options)
      opt = odeset(old_ODE_options,'field1',value1,'field2',value2,...)
      opt = odeset('field1',value1,'field2',value2,'field3',value3,...)

      The usage of odeget will be one of the following:

      option_value = odeget(ODE_structure,'field')
      option_value = odeget(ODE_structure,'field',default_value)
      option_value = odeget(any_struct,'field',default_value,'fast')
      The last usage is needed for MATLAB compatibility and represents a fast access to the given structure with no error checking on options names.

      Fuzzy string comparison

Users may make mistakes while typing ODE option names, or they may want to write everything in lower case for faster coding. Therefore my odeset and odeget functions behave in a user-friendly way, calling the fuzzy_compare function, which determines the distance of each structure option name from the given word and returns the indices of those options whose distance is below a given tolerance.

The idea is that letter case is not relevant. If we want an exact match, the tolerance will be 0. If no index is returned, it is definitely a typing error. If one index is returned but the words don't match exactly, a warning is displayed saying that the program will go on assuming the user meant the closest option name. Otherwise, all the fields whose distance is below the tolerance are displayed and the user is asked to insert the name again.

      Signature of this function follows.

      res = fuzzy_compare(string1,string_set,correctness)
      • string1 must be a string and is the option name we're looking for;
      • string_set must be a cell array of strings or a column vector of strings; it represents the set of option names;
      • correctness is the tolerance we want. It is an optional input argument;
We may set correctness equal to 'exact' or 0, in which case fuzzy_compare will look only for exact matches; if we set correctness to any positive integer, it will be the tolerance of the method (that is, the maximum accepted distance from the given string). If we don't specify anything for correctness, then the tolerance is calculated as a function of the minimum distance and of the length of the given string: the smaller the distance, the smaller the tolerance; the greater the length, the greater the tolerance.
      tolerance = minimus + floor(length(string1)/4)*floor(minimus/3)
      There exist many definitions of distance between strings. I've chosen the Levenshtein distance.

      Levenshtein distance

The Levenshtein distance is a string metric equal to the minimum number of single-character edits (insertions, deletions and substitutions) required to change one word into the other. This is the main algorithm of the levenshtein function:

function [dist,d] = levenshtein(string1,string2)
m = length(string1);
n = length(string2);

d = zeros(m+1,n+1);
d(2:m+1,1) = [1:1:m];
d(1,2:n+1) = [1:1:n];

for j=2:1:n+1
for k=2:1:m+1
% standard recurrence: deletion, insertion, substitution
d(k,j) = min([d(k-1,j)+1, d(k,j-1)+1, ...
d(k-1,j-1)+(string1(k-1)~=string2(j-1))]);
endfor
endfor

dist = d(m+1,n+1);
endfunction
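For comparison, the same dynamic-programming recurrence in Python (an illustrative re-implementation, not the odepkg code):

```python
# Levenshtein distance via the standard DP table: d[k][j] holds the
# distance between the first k characters of a and the first j of b.
def levenshtein(a, b):
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for k in range(m + 1):
        d[k][0] = k
    for j in range(n + 1):
        d[0][j] = j
    for j in range(1, n + 1):
        for k in range(1, m + 1):
            cost = 0 if a[k - 1] == b[j - 1] else 1
            d[k][j] = min(d[k - 1][j] + 1,          # deletion
                          d[k][j - 1] + 1,          # insertion
                          d[k - 1][j - 1] + cost)   # substitution
    return d[m][n]

print(levenshtein("AbsTol", "abstol"))  # two case-only substitutions
```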

      All the code is available under the terms of GNU General Public License at my_octave-odepkg.

by Roberto Porcù at December 09, 2013 08:32 AM

      Backward Euler with Inexact Newton solver

      Backward Euler

The backward Euler method is one of the most common methods used to solve ODEs. It is an implicit method of order one, but its strength lies in its A-stability property.

      Given a system of ODEs of the form: $$\frac{d\mathbf{y}}{dt} = \mathbf{f}(t,\mathbf{y}) \\ \mathbf{y}(t_0) = \mathbf{y}_0\ ,$$ the backward Euler method is written as: $$ \mathbf{y}_{k+1} = \mathbf{y}_k + \Delta t\ \mathbf{f}(t_{k+1},\mathbf{y}_{k+1})\ .$$

If we define u to be the exact solution then we have that: $$ \frac{d\mathbf{u}}{dt} = \mathbf{f}(t,\mathbf{u}) \\ \mathbf{u}(t_k) = \mathbf{y}_k $$ So that $$\mathbf u(t_{k+1}) = \mathbf u(t_k) + \int_{t_k}^{t_{k+1}} \mathbf{f}(t,\mathbf u) dt\ ,$$ which, under some conditions of regularity, can be rewritten as $$\mathbf u(t_{k+1}) = \mathbf u(t_k) + \Delta t\ \mathbf f(t_{k+1},\mathbf u(t_{k+1})) - \frac{{\Delta t}^2}{2} \mathbf f'(\xi)\ .$$

From these results it can be derived that the local truncation error (LTE) is $$ \mathbf e_{k+1} = \mathbf y_{k+1} - \mathbf u(t_{k+1}) = \mathcal O({\Delta t}^2)\ .$$
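As a minimal illustration of the method on the stiff linear test problem \(y' = -5y\) (a Python sketch with illustrative names, not the odepkg code; for a linear right-hand side one Newton iteration per step is exact):

```python
# Backward Euler on y' = -5y, y(0) = 1, with a per-step scalar Newton
# iteration solving the implicit equation z = y + dt*f(t+dt, z).
import math

def backward_euler(f, dfdy, y0, t0, t1, dt):
    t, y = t0, y0
    while t < t1 - 1e-13:
        # Solve r(z) = z - y - dt*f(t+dt, z) = 0 for the new value z.
        z = y
        for _ in range(50):
            r = z - y - dt * f(t + dt, z)
            if abs(r) < 1e-14:
                break
            z -= r / (1.0 - dt * dfdy(t + dt, z))
        y, t = z, t + dt
    return y

y = backward_euler(lambda t, y: -5.0 * y, lambda t, y: -5.0,
                   1.0, 0.0, 1.0, 0.01)
print(y, math.exp(-5.0))  # first-order accurate, stable for stiff decay
```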

      Inexact Newton

Now we have to solve the nonlinear system built up with the backward Euler method, and we want to use an inexact Newton solver. The classical Newton method reads: given a nonlinear system \(\mathbf F(\mathbf x) = \mathbf 0\) and an initial guess, solve the linearized system: $$ \mathbf F'(\mathbf x_k)\mathbf s_k + \mathbf F(\mathbf x_k) = \mathbf 0 \\ \mathbf x_{k+1} = \mathbf x_k + \mathbf s_k\ ,$$ iterating until a given tolerance is reached.

      In many cases, especially when we don't have an explicit expression for the Jacobian or it can't be inverted, we can use an inexact Newton method: $$ || \mathbf F'(\mathbf x_k)\mathbf s_k + \mathbf F(\mathbf x_k) || \leq \eta_k || \mathbf F(\mathbf x_k)||\ ,$$ where \(\eta_k\) is said to be the forcing term. The linearized system can be solved with an iterative method such as GMRES or preconditioned conjugate gradient.

      There are two possible choices for the forcing term:

      1. first choice $$ \eta_k = \frac{|| \mathbf F(\mathbf x_k)-\mathbf F(\mathbf x_{k-1}) -\mathbf F'(\mathbf x_{k-1})\mathbf s_{k-1} ||}{||\mathbf F(\mathbf x_{k-1}) ||}\ ;$$
      2. second choice $$ \eta_k = \gamma \Big(\frac{|| \mathbf F(\mathbf x_k) ||}{|| \mathbf F(\mathbf x_{k-1}) ||}\Big)^{\alpha}\ .$$

Choosing the forcing term with one of the two previous options, rather than imposing a fixed tolerance, makes it possible to avoid the phenomenon of oversolving. Another advantage is that if we use GMRES as the linear solver it is not necessary to know the Jacobian or to determine it, because we can approximate the term $$ \mathbf F'(\mathbf x_{k})\mathbf s_k $$ with a first-order finite-difference formula: $$ \mathbf F'(\mathbf x_k)\mathbf s_k \approx \frac{\mathbf F(\mathbf x_k + \delta\mathbf s_k) - \mathbf F(\mathbf x_k)}{\delta} \ .$$
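The finite-difference approximation above is easy to check numerically; this Python sketch (illustrative names, not the odepkg code) compares it with the exact directional derivative for a small nonlinear system:

```python
# Jacobian-free approximation F'(x) s ~= (F(x + delta*s) - F(x)) / delta.
def directional_derivative(F, x, s, delta=1e-7):
    shifted = [xi + delta * si for xi, si in zip(x, s)]
    return [(fp - fm) / delta for fp, fm in zip(F(shifted), F(x))]

# F(x) = (x0^2 - x1, x0 + x1); at x = (1, 2) with s = (1, 1) the exact
# value of F'(x) s is (2*x0*s0 - s1, s0 + s1) = (1, 2).
F = lambda x: [x[0] ** 2 - x[1], x[0] + x[1]]
print(directional_derivative(F, [1.0, 2.0], [1.0, 1.0]))
```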

The signature of inexact_newton is the same as that of fsolve:


by Roberto Porcù at December 09, 2013 08:28 AM

      Spectral Variational Integrators

      Variational integrators

Variational integrators are a class of numerical methods for mechanical systems which come from the discrete formulation of Hamilton's principle of stationary action. They can be applied to ODEs, PDEs and to both conservative and forced systems. In the absence of forcing terms these methods preserve momenta related to symmetries of the problem and don't dissipate energy, so they exhibit long-term stability and good long-term behaviour.

Considering a configuration manifold V, the discrete Lagrangian is a function from \(V\times V\) to the real numbers, which represents an approximation of the action between two fixed configurations: $$ L_d(\mathbf q_k,\mathbf q_{k+1}) \approx \int_{t_k}^{t_{k+1}} L(\mathbf q,\dot{\mathbf q};t) dt \hspace{0.5cm}\text{with}\hspace{0.4cm}\mathbf q_{k},\mathbf q_{k+1}\hspace{0.3cm}\text{fixed.}$$ From here, applying Hamilton's principle, we can get the discrete Euler-Lagrange equations: $$D_2L_d(\mathbf q_{k-1},\mathbf q_k) + D_1 L_d(\mathbf q_k,\mathbf q_{k+1}) = 0\ ,$$ and thanks to the discrete Legendre transforms we get the discrete Hamilton's equations of motion: $$\left\{\begin{array}{l} \mathbf p_k = -D_1 L_d(\mathbf q_k,\mathbf q_{k+1}) \\ \mathbf p_{k+1} = D_2 L_d(\mathbf q_k,\mathbf q_{k+1}) \end{array}\right.\ ,$$ so that we pass from a 2-step system of order N to a 1-step system of order 2N. This system gives the updating map: $$ (\mathbf q_k,\mathbf p_k)\rightarrow(\mathbf q_{k+1},\mathbf p_{k+1})\ .$$ For all the theory behind this I refer to "Discrete mechanics and variational integrators" by Marsden and West.
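As a small worked example, the discrete Euler-Lagrange equations for the harmonic oscillator \(L(q,\dot q) = \dot q^2/2 - q^2/2\) with the midpoint discrete Lagrangian \(L_d(q_0,q_1) = h\,L((q_0+q_1)/2,(q_1-q_0)/h)\) reduce to an explicit two-step recursion; the Python sketch below (illustrative only, not part of the project code) shows the typical bounded energy behaviour:

```python
# DEL equation D2 L_d(q_{k-1}, q_k) + D1 L_d(q_k, q_{k+1}) = 0 worked
# out for this L_d gives q_{k+1} = alpha*q_k - q_{k-1} with the alpha
# below; the energy stays bounded even over very long runs.
import math

h = 0.05
alpha = 2.0 * (1.0 - h * h / 4.0) / (1.0 + h * h / 4.0)

q_prev, q = 1.0, math.cos(h)      # samples of the exact solution cos(t)
max_energy_err = 0.0
for _ in range(20000):
    q_prev, q = q, alpha * q - q_prev
    v = (q - q_prev) / h          # discrete velocity of the last step
    E = 0.5 * v * v + 0.5 * ((q + q_prev) / 2.0) ** 2
    max_energy_err = max(max_energy_err, abs(E - 0.5))
print(max_energy_err)  # remains small after 20000 steps
```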

      Spectral variational integrators

To create a spectral variational integrator I considered a discretization of the configuration manifold on an n-dimensional function space generated by the orthogonal basis of Legendre polynomials. After rescaling the problem from \([t_k,t_{k+1}]\) (with \(t_{k+1}-t_k=h\)) to \([-1,1]\), we get: $$ \begin{array}{l} \mathbf q_k(z) = \sum_{i=0}^{n-1}\mathbf q_k^i l_i(z)\\ \dot{\mathbf q}_k(z) = \frac{2}{h} \sum_{i=0}^{n-1}\mathbf q_k^i \dot l_i(z) \end{array} \ .$$ Then I approximate the action using a Gaussian quadrature rule with \(m\) internal nodes; putting it all together we have: $$ \int_{t_k}^{t_{k+1}} L(\mathbf q,\dot{\mathbf q};t)dt\hspace{0.5cm} \approx\hspace{0.5cm} \frac{h}{2}\sum_{j=0}^{m-1} \omega_j L\big( \sum_{i=0}^{n-1}\mathbf q_k^i l_i , \frac{2}{h} \sum_{i=0}^{n-1}\mathbf q_k^i \dot l_i \big) $$ Now imposing Hamilton's principle of stationary action under the constraints: $$ \mathbf q_k = \sum_{i=0}^{n-1}\mathbf q_k^i l_i(-1) \hspace{1.5cm} \mathbf q_{k+1} = \sum_{i=0}^{n-1}\mathbf q_k^i l_i(1)\ ,$$ we obtain the system: $$ \left\{ \begin{array}{l} \sum_{j=0}^{m-1}\omega_j \bigg[ \mathbf p_j \dot{l}_s(z_j) - \frac{h}{2} l_s(z_j) \frac{\partial H}{\partial \mathbf q} \bigg ( \sum_{i=0}^{n-1}\mathbf q_k^i l_i(z_j),\mathbf p_j \bigg ) \bigg ] + l_s(-1)\mathbf p_k - l_s(1)\mathbf p_{k+1} = 0 \hspace{1cm} \forall s=0,\dots,n-1\\ \frac{\partial H}{\partial \mathbf p}\bigg (\sum_{i=0}^{n-1}\mathbf q_k^i l_i(z_r),\mathbf p_r\bigg ) -\frac{2}{h}\ \sum_{i=0}^{n-1} \mathbf q_k^i \dot{l}_i(z_r)=0 \hspace{1cm} \forall r=0,\dots,m-1 \\ \sum_{i=0}^{n-1} \mathbf q_k^i l_i(-1) - \mathbf q_k = 0\\ \sum_{i=0}^{n-1} \mathbf q_k^i l_i(1) - \mathbf q_{k+1} = 0 \end{array}\right. $$
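The quadrature ingredient can be illustrated separately: a 3-node Gauss-Legendre rule on \([-1,1]\) (nodes and weights hard-coded below; a Python sketch, not the actual solver code) integrates polynomials up to degree \(2m-1 = 5\) exactly:

```python
# 3-node Gauss-Legendre rule on [-1, 1]: nodes 0 and +-sqrt(3/5),
# weights 8/9 and 5/9. Exact for polynomials of degree <= 5.
import math

nodes = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
weights = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]

def gauss(f):
    return sum(w * f(z) for w, z in zip(weights, nodes))

print(gauss(lambda x: x ** 4))   # exact: integral of x^4 is 2/5
print(gauss(lambda x: x ** 6))   # degree 6: no longer exact (true value 2/7)
```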


Within odeSPVI I implemented a geometric integrator for Hamiltonian systems, like odeSE and odeSV, which uses spectral variational integrators with any order of the polynomial basis and any number of internal quadrature nodes, for both one-dimensional and multidimensional problems.

      This solver just uses the stepper spectral_var_int.m which is the function where I implemented the resolution of the system displayed above; hamilton_equations must be a function_handle or an inline function where the user has to implement Hamilton's equations of motion: $$ \left\{\begin{array}{l} \dot{\mathbf q} = \frac{\partial H}{\partial \mathbf p}(\mathbf q,\mathbf p)\\ \dot{\mathbf p} = - \frac{\partial H}{\partial \mathbf q}(\mathbf q,\mathbf p) + \mathbf f(\mathbf q,\mathbf p)\end{array}\right.$$ where \(H(\mathbf q,\mathbf p)\) is the Hamiltonian of the system and \(\mathbf f(\mathbf q,\mathbf p)\) is an optional forcing term.

Below is the implementation of the stepper:

      function [t_next,x_next,err]=spectral_var_int(f,t,x,dt,options)
      fsolve_opts = optimset('Jacobian','on');

      q_dofs = odeget(options,'Q_DoFs',[],'fast');
      p_dofs = odeget(options,'P_DoFs',[],'fast');
      nodes = odeget(options,'Nodes',[],'fast');
      weights = odeget(options,'Weights',[],'fast');
      leg = odeget(options,'Legendre',[],'fast');
      deriv = odeget(options,'Derivatives',[],'fast');
      extremes = odeget(options,'Extremes',[],'fast');

      dim = length(x)/2;
      N = q_dofs+p_dofs+2;
      q0 = x(1:dim)(:)';
      p0 = x(dim+1:end)(:)';

      SPVI = @(y)svi_system(y,q_dofs,p_dofs,weights,leg, ...
      y0 = [q0;zeros(q_dofs-1,dim);ones(p_dofs,1)*p0;q0;p0];
      y0 = reshape(y0,dim*N,1);
      z = fsolve(SPVI,y0,fsolve_opts);
      %z = inexact_newton(SVI,y0,new_opts) %new_opts must be set
      z = reshape(z,N,dim);

      y = [leg*z(1:q_dofs,1:dim),z((q_dofs+1):(end-2),1:dim)];
      x_next = [y;z(end-1,:),z(end,:)]';
      t_next = [t+(dt/2)*(nodes+1), t+dt]';

      q_dofs_err = odeget(options,'Q_DoFs_err',[],'fast');
      p_dofs_err = odeget(options,'P_DoFs_err',[],'fast');
      nodes_err = odeget(options,'Nodes_err',[],'fast');
      weights_err = odeget(options,'Weights_err',[],'fast');
      leg_err = odeget(options,'Legendre_err',[],'fast');
      deriv_err = odeget(options,'Derivatives_err',[],'fast');
      extreme_err = odeget(options,'Extremes_err',[],'fast');

      N_err = q_dofs_err+p_dofs_err+2;
      q1 = x_next(1:dim,end)(:)';
      p1 = x_next(dim+1:end,end)(:)';

      SPVI = @(y)svi_system(y,q_dofs_err,p_dofs_err, ...
      weights_err,leg_err,deriv_err,extreme_err, ...
      p_interp = zeros(p_dofs_err,dim);

      p_lower_order = [p0;z(q_dofs+1:q_dofs+p_dofs,:);p1];
      for i=1:1:p_dofs_err
      p_interp(i,:) = .5*(p_lower_order(i,:) + ...

      y0 = [z(1:q_dofs,:);zeros(1,dim);p_interp;q1;p1];
      y0 = reshape(y0,dim*N_err,1);
      z = fsolve(SPVI,y0,fsolve_opts);
%z = inexact_newton(SPVI,y0,new_opts) %new_opts must be set
      z = reshape(z,N_err,dim);

      x_est = [z(end-1,:),z(end,:)]';
      err = norm(x_est-x_next(:,end),2);
At lines 22 and 56 I put a comment line to show that the implicit system can also be solved with inexact_newton (the solver explained in the previous post on backward Euler), so it is not mandatory to use fsolve. I will turn this into an option so that the user can choose which solver to use.

Another important aspect to point out is that, with an adaptive timestep, the error is estimated by computing a new solution on the same timestep but with polynomials one order higher and one more internal quadrature node. Furthermore, for this second solution the starting guess for fsolve is chosen in a non-trivial way: at line 53 we see that y0 has in its first q_dofs_err-1 rows the same modal values previously calculated for the new solution x_next, with a row of zeros just below. The starting nodal values for the momenta are then set (lines 47-51) as a simple average of the nodal values of the new solution. This may look crude, but I found empirically that fsolve performs the same number of iterations as with more sophisticated reinitializations of the nodal values, so the simple average is less computationally expensive.

The following code shows the implementation of the system to be solved:

      function [F,Jac] = svi_system(y,q_dofs,p_dofs,w,L,D,C, ...
      F = zeros(N*dim,1);
      V = zeros(p_dofs,dim*2);
      X = zeros(dim*2,1);
      W = reshape(y,N,dim);
      for i = 1:1:p_dofs
      X = [L(i,:)*W(1:q_dofs,:),W(i+q_dofs,:)]';
      V(i,:) = f(t,X);
      for i = 1:1:dim
      F((1+N*(i-1)):(q_dofs+N*(i-1)),1) = (ones(1,p_dofs)* ...
      (((w.*y((q_dofs+1+N*(i-1)):(q_dofs+p_dofs+N*(i-1))))* ...
      ones(1,q_dofs)).*D + (((dt/2).*w.*V(:,i+dim))* ...
      ones(1,q_dofs)).*L) + (p0(i)*ones(1,q_dofs)).*C(1,:) - ...
      F(1+N*(i-1)+q_dofs:N*(i-1)+q_dofs+p_dofs,1) = V(:,i) - ...
      F(N*i-1) = C(2,:)*y((1+N*(i-1)):(q_dofs+N*(i-1)))-y(N*i-1);
      F(N*i) = C(1,:)*y((1+N*(i-1)):(q_dofs+N*(i-1))) - q0(i);
      flag = 0;
      flag = 1;
      DV = zeros((dim*2)^2,p_dofs);
      if( flag==1 )
      for i = 1:1:p_dofs
      X = [L(i,:)*W(1:q_dofs,:),W(i+q_dofs,:)]';
      DV(:,i) = differential(f,t,X);
      for i = 1:1:p_dofs
      X = [L(i,:)*W(1:q_dofs,:),W(i+q_dofs,:)]';
      [~,DV(:,i)] = f(t,X);
      DV = DV';
      for u=1:1:dim
Here, at line 46, I don't show the implementation of the Jacobian of the implicit system because, with this visualization style, it would look very chaotic. The important aspect is that the user can exploit the implemented Jacobian to speed up the computation; I verified that it really does so when fsolve is used as the solver. Furthermore, the code checks whether the user has also implemented, in the function defining Hamilton's equations, the Hessian of the Hamiltonian (which is needed in the computation of the Jacobian of the system). If the function returns the Hessian as its second output, the program uses it and the computation is faster; otherwise it approximates the Hessian with the following function and the computation is slower:
function Hamilt = differential(f,t,x)
  f_x = f(t,x);
  dim_f = length(f_x);
  dim_x = length(x);
  if( dim_f ~= dim_x )
    error('not implemented yet');
  end
  Hamilt = zeros(dim_f*dim_x,1);
  delta = sqrt(eps);
  for i = 1:1:dim_f
    for j = i:1:dim_x
      Hamilt(i+(j-1)*dim_f) = (1/delta)*(f(t,x+delta* ...
          [zeros(j-1,1);1;zeros(dim_x-j,1)])(i) - f_x(i));
      Hamilt(j+(i-1)*dim_x) = Hamilt(i+(j-1)*dim_f);
    end
  end
end

The Hessian of the Hamiltonian must be stored in a particular way (which I still have to optimize, although the current one works fine), shown in the following example: the definition of Hamilton's equations for the harmonic oscillator:

function [dy,ddy] = armonic_oscillator(t,y)
  dy = zeros(size(y));
  d = length(y)/2;
  q = y(1:d);
  p = y(d+1:end);

  dy(1:d) = p;
  dy(d+1:end) = -q;

  H = eye(2*d);
  ddy = transf(H);
end

function v = transf(H)
  [r,c] = size(H);
  v = zeros(r*c,1);
  for i=1:1:r
    for j=1:1:c
      v(i+(j-1)*r) = H(i,j);
    end
  end
end

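Incidentally, the layout produced by transf, v(i+(j-1)*r) = H(i,j), is ordinary column-major flattening; in NumPy terms (a quick illustrative check, not part of the Octave code) it is flatten(order='F'):

```python
import numpy as np

# transf stores H(i, j) at position i + (j-1)*r (1-based), i.e. it walks
# the matrix column by column -- exactly Fortran ("F") order.
H = np.array([[1, 2],
              [3, 4]])
v = H.flatten(order='F')
print(v)  # columns first: [1 3 2 4]
```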
by Roberto Porcù at December 09, 2013 08:20 AM

      December 02, 2013

      Roberto Porcù

      Symplectic Euler (semi-implicit Euler), Velocity-Verlet and Stormer-Verlet methods

      Symplectic Euler method

The symplectic Euler method is a modification of the Euler method, useful for solving Hamilton's equations of motion, i.e. a system of ODEs whose unknowns are the generalized coordinates q and the generalized momenta p. It is of first order, but it is better than the classical Euler method because it is a symplectic integrator, so it yields better results.

      Given a Hamiltonian system with Hamiltonian H=H(t;q,p) then the system of ODE to solve writes: $$\left\{\begin{array}{l} \dot{q} = \frac{dH}{dp}(t;q,p) = f(t;q,p)\\ \dot{p}=-\frac{dH}{dq}(t;q,p) = g(t;q,p) \end{array}\right. $$

      From E. Hairer, C. Lubich and G. Wanner, "Geometric Numerical Integration" we can state that the semi-implicit Euler scheme for previous ODEs writes: $$ \left\{\begin{array}{l} p_{k+1} = p_k + h g(t_k;q_k,p_{k+1}) \\ q_{k+1}=q_k + h f(t_k;q_k,p_{k+1}) \end{array}\right. $$ If g(t;q,p) does not depend on p, then this scheme will be totally explicit, otherwise the first equation will be implicit and the second will be explicit.
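When g(t;q,p) does not depend on p, both updates can be computed directly. A minimal Python sketch of one such explicit step (illustrative only; the function names are invented here and this is not odeSE.m / symplectic_euler.m):

```python
import numpy as np

# One symplectic Euler step, assuming g = -dH/dq does not depend on p,
# so both updates are explicit:
#   p_{k+1} = p_k + h*g(t_k; q_k)
#   q_{k+1} = q_k + h*f(t_k; q_k, p_{k+1})
def symplectic_euler_step(f, g, t, q, p, h):
    p_next = p + h * g(t, q, p)
    q_next = q + h * f(t, q, p_next)
    return q_next, p_next

# Harmonic oscillator H = (p^2 + q^2)/2: f = p, g = -q (p-independent).
f = lambda t, q, p: p    #  dH/dp
g = lambda t, q, p: -q   # -dH/dq
q, p, h = 1.0, 0.0, 2*np.pi/1000
for _ in range(1000):
    q, p = symplectic_euler_step(f, g, 0.0, q, p, h)
# being symplectic, the method keeps the energy close to H(q0, p0) = 0.5
print(0.5*(q**2 + p**2))
```

Unlike the classical explicit Euler method, the energy does not drift; it merely oscillates around the initial value.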

      The function that I created to make use of this integrator is odeSE.m and it passes to the integrate functions the stepper defined in the function symplectic_euler.m. The signature of odeSE is similar to those of the other ode solvers:

It's important to know that y0 must be a vector containing the initial values for the coordinates in its first half and the initial values for the momenta in its second half; the function handle (or inline function) ode_fcn must take exactly two input arguments (the first is the time, the second is the vector containing coordinates and momenta as in y0) and return a vector containing dq/dt in its first half and dp/dt in the second half.

The options variable can be set with odeset and, if the system of ODEs is explicit, the field options.Explicit can be set to 'yes' in order to speed up the computation. If tspan has only one element, then options.TimeStepNumber and options.TimeStepSize must not be empty, so that the integrate function integrate_n_steps will be used.

      This is the code of the stepper symplectic_euler.m:

function [x_next,err] = symplectic_euler(f,t,x,dt,options)
  dim = length(x)/2;
  q = x(1:dim);
  p = x(dim+1:end);
  if( strcmp(options.Explicit,'yes') )
    p_next = p + dt*(f(t,[q; p])(dim+1:end));
    q_next = q + dt*f(t,[q; p_next])(1:dim);
    x_next = [q_next; p_next];
    if(nargout == 2)
      dt_new = dt/2;
      p_next = p + dt_new*(f(t,[q; p])(dim+1:end));
      q_next = q + dt_new*f(t,[q; p_next])(1:dim);
      q = q_next;
      p = p_next;
      t = t+dt_new;
      p_next = p + dt_new*(f(t,[q; p])(dim+1:end));
      q_next = q + dt_new*f(t,[q; p_next])(1:dim);
      x_est = [q_next;p_next];
      err = norm(x_next-x_est,2);
    end
  else
    SE1 = @(y) (y-p-dt*(f(t,[q; y])(dim+1:end)));
    p_next = fsolve(SE1,zeros(size(p)));
    q_next = q + dt*f(t,[q; p_next])(1:dim);
    x_next = [q_next; p_next];
    if(nargout == 2)
      dt_new = dt/2;
      SE1 = @(y) (y-p-dt_new*(f(t,[q; y])(dim+1:end)));
      p_next = fsolve(SE1,zeros(size(p)));
      q_next = q + dt_new*f(t,[q; p_next])(1:dim);
      q = q_next;
      p = p_next;
      t = t+dt_new;
      SE1 = @(y) (y-p-dt_new*(f(t,[q; y])(dim+1:end)));
      p_next = fsolve(SE1,zeros(size(p)));
      q_next = q + dt_new*f(t,[q; p_next])(1:dim);
      x_est = [q_next;p_next];
      err = norm(x_next-x_est,2);
    end
  end
end

      Velocity-Verlet method

      The velocity-Verlet method is a numerical method used to integrate Newton's equations of motion. The Verlet integrator offers greater stability, as well as other properties that are important in physical systems such as preservation of the symplectic form on phase space, at no significant additional cost over the simple Euler method.

      If we call x the coordinate, v the velocity and a the acceleration then the equations of motion write: $$\left\{ \begin{array}{l} \frac{dx}{dt} = v(t,x)\\ \frac{dv}{dt} = a(t,x) \end{array} \right. $$

Thus, given the initial conditions (coordinates and velocities), the velocity-Verlet scheme writes: $$ \left\{ \begin{array}{l} x_{k+1} = x_k + h v_k + 0.5 h^2 a_k\\ v_{k+1} = v_k + 0.5 h (a_k + a_{k+1}) \end{array}\right. $$ where $$ a_{k+1} = a(t_{k+1},x_{k+1})\ .$$

      This method is one order better than the symplectic Euler method. The global error of this method is of order two. Furthermore, if the acceleration indeed results from the forces in a conservative mechanical or Hamiltonian system, the energy of the approximation essentially oscillates around the constant energy of the exactly solved system, with a global error bound again of order two.
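One velocity-Verlet step can be sketched in a few lines of Python (illustrative only, not velocity_verlet.m; the function name is made up here):

```python
# One velocity-Verlet step for Newton's equations with acceleration a(t, x):
#   x_{k+1} = x_k + h v_k + 0.5 h^2 a_k
#   v_{k+1} = v_k + 0.5 h (a_k + a_{k+1})
def velocity_verlet_step(a, t, x, v, h):
    a_k = a(t, x)
    x_next = x + h*v + 0.5*h*h*a_k
    v_next = v + 0.5*h*(a_k + a(t + h, x_next))
    return x_next, v_next

# Free fall, a = -g: because the acceleration is constant, the scheme
# reproduces x(t) = -0.5*g*t^2 exactly (up to rounding).
x, v, h = 0.0, 0.0, 0.01
for i in range(100):
    x, v = velocity_verlet_step(lambda t, x: -9.81, i*h, x, v, h)
print(x)
```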

The function that uses the velocity-Verlet scheme is odeVV.m and the stepper called at each iteration is velocity_verlet.m. The signature of odeVV is the same as that of the other ODE solvers:


The documentation of the input arguments is the same as described in the previous symplectic Euler section, with the difference that now the function ode_fcn must return a vector containing the velocities in its first half and the expression of the acceleration in the second half.

This is the code of the stepper velocity_verlet.m:

function [x_next,err] = velocity_verlet(f,t,x,dt,options)
  dim = length(x)/2;
  q = x(1:dim);
  v = x(dim+1:end);
  a = f(t,x);
  q_next = q + dt*v + .5*dt*dt*a(dim+1:end);
  v_next = v + .5*dt*((a+f(t+dt,[q_next; v]))(dim+1:end));
  x_next = [q_next; v_next];
  if(nargout == 2)
    dt_new = dt/2;
    q_next = q + dt_new*v + .5*dt_new*dt_new*a(dim+1:end);
    v_next = v + .5*dt_new*((a + ...
        f(t+dt_new,[q_next; v]))(dim+1:end));
    t = t+dt_new;
    q = q_next;
    v = v_next;
    a = f(t,[q; v]);
    q_next = q + dt_new*v + .5*dt_new*dt_new*a(dim+1:end);
    v_next = v + .5*dt_new*((a + ...
        f(t+dt_new,[q_next; v]))(dim+1:end));
    x_est = [q_next; v_next];
    err = norm(x_next - x_est,2);
  end
end

      Stormer-Verlet method

      The Stormer-Verlet scheme is a symplectic integrator of order two, useful to integrate Hamiltonian systems in the form described in the previous symplectic Euler section. It is characterized by the approximation of the integral defining the discrete Lagrangian $$ L_d(q_k,q_{k+1})\approx\int_{t_k}^{t_{k+1}}L(t;q,\dot{q})dt $$ with the trapezoidal rule. So that $$ L_d(q_k,q_{k+1}) = \frac{h}{2}\bigg( L\Big(q_k,\frac{q_{k+1}-q_k}{h}\Big) + L\Big(q_{k+1},\frac{q_{k+1}-q_k}{h}\Big) \bigg)\ .$$

      Considering again the system: $$ \left\{\begin{array}{l} \dot{q} = \frac{dH}{dp}(t;q,p) = f(t;q,p)\\ \dot{p}=-\frac{dH}{dq}(t;q,p) = g(t;q,p) \end{array}\right. \ ,$$ the scheme implemented for this integrator (described in E. Hairer, C. Lubich and G. Wanner, "Geometric Numerical Integration") writes as follows: $$ \left\{\begin{array}{l} p_{k+\frac{1}{2}} = p_k + \frac{h}{2}g(t_k;q_k,p_{k+\frac{1}{2}}) \\ q_{k+1}=q_k+\frac{h}{2}\Big( f(t_k;q_k,p_{k+\frac{1}{2}}) + f(t_{k+1};q_{k+1},p_{k+\frac{1}{2}})\Big) \\ p_{k+1} = p_{k+\frac{1}{2}} + \frac{h}{2} g(t_{k+\frac{1}{2}};q_{k+1},p_{k+\frac{1}{2}}) \end{array}\right. $$
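For a separable Hamiltonian H(q,p) = T(p) + V(q), all three stages of the scheme above become explicit, since f depends only on p and g only on q, and the two f-evaluations in the position update coincide. A minimal Python sketch (illustrative only, not stormer_verlet.m):

```python
import math

# One Stormer-Verlet ("leapfrog") step for H(q, p) = T(p) + V(q),
# with f(p) = T'(p) and g(q) = -V'(q):
def stormer_verlet_step(f, g, q, p, h):
    p_half = p + 0.5*h*g(q)            # half kick
    q_next = q + h*f(p_half)           # drift (both f-evaluations coincide)
    p_next = p_half + 0.5*h*g(q_next)  # half kick
    return q_next, p_next

# Pendulum: H = p^2/2 - cos(q), so f = p and g = -sin(q).
q, p, h = 1.0, 0.0, 0.01
E0 = 0.5*p*p - math.cos(q)
for _ in range(10000):
    q, p = stormer_verlet_step(lambda p: p, lambda q: -math.sin(q), q, p, h)
# second-order symplectic: the energy error stays bounded and O(h^2)
print(abs(0.5*p*p - math.cos(q) - E0))
```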

      The function in which this method is implemented is odeSV.m and it calls the stepper stormer_verlet.m. The documentation is the same of odeSE.m. Its implementation is the following:

function [x_next,err] = stormer_verlet(f,t,x,dt,options)
  dim = length(x)/2;
  q = x(1:dim);
  p = x(dim+1:end);
  if( strcmp(options.Explicit,'yes') )
    p_mid = p + .5*dt*(f(t,[q; p])(dim+1:end));
    q_next = q + .5*dt*((f(t,[q; p_mid])(1:dim))+ ...
        (f(t+dt,[q; p_mid])(1:dim)));
    p_next = p_mid +.5*dt*(f(t+dt,[q_next;p_mid])(dim+1:end));
    x_next = [q_next; p_next];
    if(nargout == 2)
      dt_new = dt/2;
      p_mid = p + .5*dt_new*(f(t,[q; p])(dim+1:end));
      q_next = q + .5*dt_new*((f(t,[q; p_mid])(1:dim))+ ...
          (f(t+dt_new,[q; p_mid])(1:dim)));
      p_next = p_mid + .5*dt_new* ...
          (f(t+dt_new,[q_next;p_mid])(dim+1:end));
      q = q_next;
      p = p_next;
      t = t+dt_new;
      p_mid = p + .5*dt_new*(f(t,[q; p])(dim+1:end));
      q_next = q + .5*dt_new*((f(t,[q; p_mid])(1:dim))+ ...
          (f(t+dt_new,[q; p_mid])(1:dim)));
      p_next = p_mid + .5*dt_new* ...
          (f(t+dt_new,[q_next;p_mid])(dim+1:end));
      x_est = [q_next; p_next];
      err = norm(x_est - x_next,2);
    end
  else
    SV1 = @(y) (y - p - .5*dt*(f(t,[q;y])(dim+1:end)));
    p_mid = fsolve(SV1,zeros(size(p)));
    SV2 = @(y) (y - q - .5*dt*((f(t,[q;p_mid])(1:dim))+ ...
        (f(t+dt,[y;p_mid])(1:dim))));
    q_next = fsolve(SV2,q);
    p_next = p_mid + .5*dt* ...
        (f(t+dt,[q_next;p_mid])(dim+1:end));
    x_next = [q_next;p_next];
    if(nargout == 2)
      dt_new = dt/2;
      SV1 = @(y) (y - p - .5*dt_new*(f(t,[q;y])(dim+1:end)));
      p_mid = fsolve(SV1,zeros(size(p)));
      SV2 = @(y) (y - q - .5*dt_new* ...
          ((f(t,[q;p_mid])(1:dim))+(f(t+dt_new,[y;p_mid])(1:dim))));
      q_next = fsolve(SV2,q);
      p_next = p_mid + .5*dt_new* ...
          (f(t+dt_new,[q_next;p_mid])(dim+1:end));
      q = q_next;
      p = p_next;
      t = t+dt_new;
      SV1 = @(y) (y - p - .5*dt_new*(f(t,[q;y])(dim+1:end)));
      p_mid = fsolve(SV1,zeros(size(p)));
      SV2 = @(y) (y - q - .5*dt_new* ...
          ((f(t,[q;p_mid])(1:dim))+(f(t+dt_new,[y;p_mid])(1:dim))));
      q_next = fsolve(SV2,q);
      p_next = p_mid + .5*dt_new* ...
          (f(t+dt_new,[q_next;p_mid])(dim+1:end));
      x_est = [q_next; p_next];
      err = norm(x_next-x_est,2);
    end
  end
end

by Roberto Porcù at December 02, 2013 05:15 AM

      November 15, 2013

      Roberto Porcù

      Ode solvers options

What I want to do at this point of the project is to give all the solvers I have written so far the ability to manage all the options of the ODE structure (defined previously by odeset) that are already managed by the corresponding MATLAB ODE solvers, so that my solvers will be fully MATLAB-compatible; the same will hold for the new solvers I am going to implement.

by Roberto Porcù at November 15, 2013 08:13 AM

      November 12, 2013

      Carnë Draug

      Snapshot release of image package 2.1.1

      The image package has accumulated a lot of changes since its last release and I’m hoping to make a new release soon (to match the release of Octave 3.8.0). I have prepared a snapshot tarball (version 2.1.1) with the current development status which can be installed easily with pkg. Would be great if users of the image package could give it a try and report any problems that the many changes may have caused.

It is important to note that this is not a final release. To make this clear, and to avoid that it becomes distributed as if it were, I made it dependent on a yet-unreleased version of Octave (it is dependent on the development version of Octave anyway), and made it print a warning about it every time the package is loaded. After reading this, download the tarball and install it with pkg install -nodeps /path/to/tarball.

      I am partial to the changes in the new version, so these are the ones that I paid more attention to:

      • complete rewrite of imdilate and imerode with a performance improvement and many compatibility bugs fixed;
      • implementation of the @strel class and support for it in the image package functions;
      • support for N-dimensional matrices in several functions;
      • rewrite of the block processing functions which in some cases performs 1000x faster.

      There are also a lot of bug fixes and new functions. Some will break backwards compatibility really bad but needed to be done for the sake of Matlab compatibility. For example, bwconncomp was returning indices for object perimeter but should have been returning indices for all elements in the objects. So do take a look at the NEWS file or use news image after installing the package.

      After this release, I plan to follow the Octave core release method of keeping two development branches: a stable branch for minor releases with small bug fixes and regressions, and a default branch for big changes. Hopefully that will allow for more frequent releases as things of different levels get ready.

      Please report any problems you have found either in Octave’s bug or patch trackers.

      by carandraug at November 12, 2013 03:38 AM

      November 11, 2013

      Roberto Porcù



      The following list explains the main points of the timeline for my SOCIS project:

      1. adapt odeset and odeget from Octave odepkg to be completely MATLAB-compatible: all option names supported by MATLAB should be available; furthermore we want the possibility to add new option names that will be useful for example for symplectic integrators;
      2. implement a new code structure for one-step integrators ode12, ode23, ode45, odefwe (forward Euler), odebwe (backward Euler) and inexact solvers, subdividing the execution into 'blocks' in order to be able to do an easier optimization and an easier extension to new functionalities; in particular implement the steppers which will be the functions called to do one integration step and then implement the integrate functions which will integrate, in different ways, over the whole time interval, calling the respective steppers. Afterwards adapt the interface of ode45 to be completely MATLAB-compatible;
      3. implement MATLAB-compatible version of deval;
      4. implement the following geometric integrators as new steppers:
        • symplectic Euler;
        • Stormer-Verlet;
        • velocity-Verlet;
        • spectral variational integrators;
        • SHAKE;
        • RATTLE.
      5. optimization, error checking and documentation of all the code written.

At the moment point 1 is completed; in point 2 I still have to adapt the interface of ode45 and refine the implemented code. Point 3 has yet to be done. In point 4 I have implemented symplectic Euler, Stormer-Verlet and spectral variational integrators, but these need to be adapted to the 'steppers' / 'integrate functions' code structure; the other integrators must still be implemented. Finally, during the implementation I will start to write a bit of documentation, and at the end I will do optimization and debugging.

by Roberto Porcù at November 11, 2013 02:17 PM

      October 25, 2013

      Roberto Porcù

      Starting Stepsize

The implementation of the integrate_adaptive function takes (for now) the first timestep as its last input argument. Obviously, the choice of the first timestep made by the user may not be the best, even though the adaptive technique will adjust the step at the second iteration. Many techniques therefore exist to find a good timestep for the first iteration.

      Here I present the one I implemented, which was proposed by Gladwell, Shampine and Brankin in 1987. It is based on the hypothesis that:$$ \text{local error}\approx Ch^{p+1}y^{(p+1)}(x_0) $$ The algorithm is the following, and usually gives a good guess for the initial timestep.

function h = starting_stepsize(f,x0,t0,p)
  d0 = norm(x0,2);
  d1 = norm(f(t0,x0),2);
  if( d0 < 1.e-5 || d1 < 1.e-5 )
    h0 = 1.e-6;
  else
    h0 = .01*(d0/d1);
  end
  x1 = x0 + h0.*f(t0,x0);
  d2 = (1.0/h0)*norm(f(t0+h0,x1)-f(t0,x0),2);
  if( max(d1,d2) <= 1.e-15 )
    h1 = max(1.e-6,h0*1.e-3);
  else
    h1 = (1.e-2 / max(d1,d2))^(1/(p+1));
  end
  h = min(100*h0,h1);
end

So, what I will do is include this algorithm in the integrate_adaptive function, which will also get a new signature, because the initial timestep is no longer needed:

      function [x,t] = integrate_adaptive(stepper,order,f,x0,t0,t1)

by Roberto Porcù at October 25, 2013 08:26 AM

      October 06, 2013

      Wendy Liu

      Sudoku code: another encryption scheme

      Another useless encryption scheme devised by yours truly. While my last one (pi code) was primarily a substitution cipher, this one transcends all standard classifications; it's almost like a transposition cipher, but not really. The main idea here is the use of certain numbers in a particular Sudoku grid. The strength (if any) in this method lies in its unexpected nature; it certainly takes quite a leap of the imagination to correctly deduce the method from the ciphertext. Of course, once the method has been discovered, deciphering merely involves solving a Sudoku grid and then figuring out the substitution cipher used, meaning that the key is easy to determine and so this method kind of just looks at Kerckhoffs's Principle and then keeps walking. But that's okay, that's why this is filed under useless.

      The setup

      The first step is to find a Sudoku puzzle. I happened to have a Sudoku game installed, so I just ran that and started a random puzzle, which looked like this:

5 . . 1 7 . . 8 .
3 1 9 . 5 . 4 . .
. . . 3 . . . 1 5
2 . . . 8 . . . .
4 . . 7 . 5 . . 9
. . . . 9 . . . 4
6 5 . . . 2 . . .
. . 2 . 6 . 5 4 8
. 4 . . 3 7 . . 2

      The (unique) solution to the above puzzle looks like this:

      5 2 4 1 7 9 3 8 6
      3 1 9 6 5 8 4 2 7
      8 6 7 3 2 4 9 1 5
      2 9 5 4 8 3 6 7 1
      4 8 6 7 1 5 2 3 9
      1 7 3 2 9 6 8 5 4
      6 5 1 8 4 2 7 9 3
      7 3 2 9 6 1 5 4 8
      9 4 8 5 3 7 1 6 2

      Clearly, each solved grid can be represented by a sequence of 81 numbers. Furthermore, due to the uniqueness of the solution, the initial grid and the solved grid are equivalent in the sense that the initial grid uniquely determines the solved grid. So we represent the initial grid as follows (with 0s standing for spaces):

500170080319050400000300015200080000400705009000090004650002000002060548040037002
      This results in the following solution grid:

524179386319658427867324915295483671486715239173296854651842793732961548948537162
      Now, depending on the length of the plaintext, we can choose somewhere between one and five numbers to use as the "holes" in the grid. For example, if we had a plaintext of 45 characters or fewer, we could choose the numbers 1, 2, 5, 6 and 9. Removing those numbers from the grid would leave us with this:

      4 7 3 8
      3 8 4 7
      8 7 3 4
      4 8 3 7 1
      4 8 7 3
      7 3 8 4
      8 4 7 3
      7 3 6 4 8
      4 8 3 7

      This has an entirely different arrangement of empty spaces than the initial grid. We can represent it using the following string:


      We can then use the empty spaces to write our message, and add characters in the rest of the spaces to try and flatten frequencies. A simple substitution cipher should be used as well to prevent the ciphertext from looking too much like a transposition cipher. The alphabetic characters can then be interspersed with the numerals above, with the resulting string separated into chunks of 32 characters (with extra alphabetic characters appended as necessary, since only 81 are needed) so as to superficially resemble MD5 (because that's just fun).

      Optimally, the solutions to the possible Sudoku grids being used should be stored in a dictionary somewhere, so that each grid only has to be solved once. The point of this encryption scheme is to be diabolically tricky but solvable, although perhaps only if you figure out the Sudoku aspect and know a crib or two.

      Encoding and decoding a sequence

      Let's say you want to encode:

      admiral yamamoto to visit solomon islands eight am

      First, remove spaces:

admiralyamamototovisitsolomonislandseightam
      This is 43 characters, so let's aim for one grid with 5 numbers. This means there will be 2 extra invalid characters and 36 nonsense characters. Since some spaces will be fairly large, it's a good idea to use a simple substitution cipher as well (or even a less simple one like Vigenère for additional fiendishness). In this case, we'll use the reverse alphabet cipher:


      So the encrypted plaintext becomes


      Now put it in the grid:

      z w 4 n 7 r 3 8 i
      3 z o b z 8 4 n 7
      8 z 7 3 n 4 l g l
      g l e 4 8 3 r 7 h
      4 8 r 7 g h l 3 o
      l 7 3 n l m 8 r 4
      h o z 8 4 m 7 w 3
      7 3 h v r t s 4 8
      g 4 8 z 3 7 n c a

      (The last two characters are the extra invalid characters used to fill up the empty spaces.)

      Now replace the remaining numbers with random letters, chosen so as to flatten the letter frequencies as much as possible:

      z w e n a r o t i
      q z o b z a e n s
      f z t r n m l g l
      g l e e a s r h h
      t r r a g h l m o
      l c c n l m z r v
      h o z l p m p w p
      a e h v r t s b d
      g g f z e q n c a

      Written as one line, the above grid would look like:


      Recall that the original grid looks as follows:


      With the numbers used added to the end of the grid, it might look like this:


      Now, we randomly combine the two: (split across two lines here due to design considerations)


      We could even put in extra characters at the end, since only the first 81 characters will be used:


      Here's how it would look if we split it up into 32-character chunks:

      z50w0ena1r7o008tiq0zo3bz1ae9nsf0 z5t0r4n0mlgl0g0le0e0as3rhh00tr0r ag1h5lm2ol00ccn0l8mzr0v0ho0z0400 lp7mpwp0a500eh9v0r0t0sbd090gg004 6f500ze02000q0020n6054c80400a3f7 0gsssg02ea1e5dllg9famezelr2ezz6x

      And there you go. Not indecipherable but certainly very misleading. To decipher, first separate the numerals from the alphabetic characters, then write out the first 81 characters within a 9x9 grid. Then fill another 9x9 grid with the first 81 numerals (one digit per box), omitting the zeros, and solve the resulting Sudoku puzzle. Take the remaining (non-zero) digits and highlight the positions in the grid corresponding to those digits. The characters corresponding to the non-highlighted positions can be discarded. Now it only remains to reverse the substitution cipher and insert spaces where necessary.


      dszb5c0ygo01ize7c0u0n80s31rwq9m0 5f0400ll0r00mw30atdx00r15gzs2000 800jq0z0n400bz7j05obo00rtbt90u0v f0u0oo9r0v00pmfvzoxxmg4650f00g20 0mnz000o20rgo605ub48ubzxxo040a03 70kteo02osz3dfle4eraosme2tn5sddq

      Implementation in Python

      All the relevant code is available in the repository for this website, under code/sudokucode. The 3 main files are shown below.

      import json
      import re
      import random
      import string
      A_ORD = ord('a')
      NOT_LETTERS_RE = re.compile('[^a-z]')
      LETTERS_RE = re.compile('[a-z]')
      NUMBERS_RE = re.compile('[0-9]')
      POSSIBLE_DIGITS = map(str, range(1, 10))
# Does the reverse alphabet cipher thing on a string
      def reverse_cipher(plaintext):
          real_letters = []
          for letter in plaintext:
              real_letters.append(chr(A_ORD + 25 - (ord(letter) - A_ORD)))
              # whatever it works don't h8
          return ''.join(real_letters)
      def contains_only_letters(plaintext):
    return NOT_LETTERS_RE.search(plaintext) is None
      def sometimes():
          return random.choice((True, False))
      class SudokuCoder:
    def __init__(self):
        try:
            self.grids = json.load(open('grids.json'))
        except IOError:
            # No grids in memory - can only encode, not decode.
            self.grids = {}
          def encode(self, plaintext):
              plaintext = plaintext.lower()
              num_letters = len(plaintext)
              letters = reverse_cipher(plaintext)
              if not contains_only_letters(plaintext):
                  raise CannotEncodeError('Can only contain alphabetic characters.')
              if num_letters > 45:
                  raise CannotEncodeError('Maximum plaintext length: 45 characters.')
              # Randomly choose a grid.
              initial_grid = random.choice(self.grids.keys())
              grid = self.grids[initial_grid]
              # Randomly choose some digits to use as the holes.
              num_digits = num_letters / 9 + 1
              digits = random.sample(POSSIBLE_DIGITS, num_digits)
              letter_index = 0
              new_grid = []
        # Now replace all the hole digits with the plaintext.
        for digit in grid:
            if digit in digits and letter_index < num_letters:
                new_grid.append(letters[letter_index])
                letter_index += 1
            else:
                # Choose a random character for both the extra ones
                # and the nonsense ones.
                new_grid.append(random.choice(string.ascii_lowercase))
        # Add extra characters depending on the number of digits.
        for digit in xrange(num_digits):
            new_grid.append(random.choice(string.ascii_lowercase))
              total_digits = initial_grid + ''.join(digits)
              # Now randomly combine them.
              grid_length = len(new_grid)
              total_length = grid_length * 2
              ciphertext = []
              letter_index = 0
              digit_index = 0
        while letter_index < grid_length or digit_index < len(total_digits):
            if ((sometimes() and letter_index < grid_length) or
                    digit_index == len(total_digits)):
                ciphertext.append(new_grid[letter_index])
                letter_index += 1
            else:
                ciphertext.append(total_digits[digit_index])
                digit_index += 1
              return ''.join(ciphertext)
          def decode(self, ciphertext):
              ciphertext = ciphertext.lower()
              # Get the grid numbers (the first 81 digits).
              all_numbers = NUMBERS_RE.findall(ciphertext)
              initial_grid = ''.join(all_numbers[:81])
              hole_numbers = all_numbers[81:]
              all_letters = LETTERS_RE.findall(ciphertext)
              grid_letters = all_letters[:81]
              # Check if the solution to this initial grid exists.
              if not initial_grid in self.grids:
                  raise GridNotFoundError
              # Get the list indices of the hole numbers.
              solution_grid = self.grids[initial_grid]
              hole_indices = []
        for i in xrange(len(solution_grid)):
            if solution_grid[i] in hole_numbers:
                hole_indices.append(i)
              hole_letters = [grid_letters[index] for index in hole_indices]
              plaintext = reverse_cipher(hole_letters)
              return plaintext
    def add_grid(self, initial, solution):
        if not initial in self.grids:
            self.grids[initial] = solution
            json.dump(self.grids, open('grids.json', 'w'))
        else:
            raise GridAlreadyExists
class GridNotFoundError(Exception):
    pass

class CannotEncodeError(Exception):
    pass

class GridAlreadyExists(Exception):
    pass

      import sys
      import utils
if len(sys.argv) == 2:
    sudoku_coder = utils.SudokuCoder()
    print sudoku_coder.encode(sys.argv[1])
else:
    print "Usage: python [plaintext]"

      import sys
      import utils
if len(sys.argv) == 2:
    sudoku_coder = utils.SudokuCoder()
    print sudoku_coder.decode(sys.argv[1])
else:
    print "Usage: python [ciphertext]"


Instead of encrypting the Sudoku puzzle itself, use a standard one. For example, if you and your imaginary recipient both have access to the New York Times, which we will pretend has a daily Sudoku puzzle if it doesn't actually, then just use that. Then the only numbers you'd have to send along with the plaintext (and the extra characters) are the hole digits. This means that the key itself (or at least the prelude to the key) does not have to be distributed along with the message, resulting in a somewhat more secure scheme (for a given value of "secure"). This will work as long as you ensure that the Sudoku puzzle you choose always has a unique solution.

      by dellsystem (Wendy Liu) at October 06, 2013 04:00 AM

      September 26, 2013

      Carnë Draug

      GSoC 2013: imerode and imdilate (update #6)

      Since I deviated so much from the original project while fixing the image IO functions in Octave core, I decided to focus only on optimizing imerode and imdilate in the final phase of GSoC. The reason is that these are at the core of many functions in the image package.

      The original plan was to do all the work by expanding the __spatial_filtering__ function, and that's where I previously started. While doing so, it became evident that a complete rewrite was necessary. convn(), which could be used in most cases of binary images, was performing much faster even though it performs a more complex operation. As such, performing at least as fast as convn() became the target, which was achieved:

      octave> se = [0 1 0; 1 1 1; 0 1 0];
      octave> im = rand (1000) > 0.5;
      octave> t = cputime (); for i = 1:100, a = imerode (im, se, "same"); endfor; cputime() - t
      ans = 2.1281
      octave> t = cputime (); for i = 1:100, a = convn (im, se, "same") == nnz (se); endfor; cputime() - t
      ans = 2.7802

      This implementation could be reused in __spatial_filtering__ to also speed up functions such as medfilt2, ordfilt2, and ordfiltn but there are specific algorithms for those cases which should be used instead.

      I have tested the new implementation of imerode against the last release (version 2.0.0) and the last development version that was still making use of __spatial_filtering__ (cset 6db5e3c6759b). The tests are very simple (test imerode), just random matrices with different numbers of dimensions. The new implementation seems to perform much faster in all cases, showing a performance increase between 1.5x and 30x (output of test imerode). The differences are bigger for grayscale images (since imerode was already using convn for binary cases), and for larger structuring elements (SE) with multiple dimensions.

      A couple of things:

      • in the latest release of the image package (version 2.0.0), there was no erosion for multidimensional binary images (only grayscale);
      • both development versions make use of the new strel class. One of the things it does is to decompose structuring elements automatically, hence why the tests use a cross rather than a square for SE;
      • I’m only testing with the shape “same” since version 2.0.0 only had that one;
      • when using binary images I test with different percentages of true values since the new implementation is sensitive to it;
      • I do not compare the results since I know the new implementations also fix some bugs, especially related to the border pixels;
      • imdilate uses exactly the same code and so I’m assuming that the differences from imerode are the same.
      Test                             version 2.0.0  cset 6db5e3c6759b  latest development  Performance  Performance
                                       (old)          (dev)              (new)               old/new      dev/new
      2D binary image (90%)            0.009081       0.024602           0.006240            1.4551       3.9423
      2D binary image (10%)            0.007360       0.022881           0.004160            1.7692       5.5000
      3D binary image with 2D SE       NO!            0.481470           0.079125            n/a          6.0849
      3D binary image with 3D SE       NO!            0.518032           0.075605            n/a          6.8519
      7D binary image with 5D SE       NO!            13.940071          0.463229            n/a          30.093
      2D uint8 image                   0.062324       0.043403           0.029322            2.1255       1.4802
      3D uint8 image with 2D SE        NO             NO                 0.430347            n/a          n/a
      3D uint8 image with 3D SE        3.061951       1.725628           0.791569            3.8682       2.1800
      7D uint8 image with 3D SE        NO             NO                 2.005325            n/a          n/a
      7D uint8 image with 7D SE        4.091456       2.940984           0.541634            7.5539       5.4298
      2D uint8 image with a larger SE  0.610678       0.305579           0.087445            6.9835       3.4945


      by carandraug at September 26, 2013 11:23 PM

      September 04, 2013

      Carnë Draug

      GSoC 2013: done with ImageIO – please test (update #5)

      The imageIO section of the original project was much shorter than this. Originally it was limited to implementing imformats, and then expanding it to do the writing and reading of multipage images in a Matlab compatible way, which is just implementing the options Index and Frames in imread, and WriteMode in imwrite. The date of that last commit was the 16th of July. However, I didn't stop there and tried to fix other incompatibilities with Matlab and add new options.

      Here’s a list of the main changes:

      • dealing with transparency: the alpha channel is now returned as the third output of imread and only accepted with the Alpha option of imwrite. Previously, the alpha channel would be considered the 2nd or 4th element on the 3rd dimension for grayscale and RGB images respectively. This prevented the distinction between an RGB image with transparency and a CMYK image. Also, since GraphicsMagick has the transparency values inverted from most other applications (Matlab inclusive), we were doing the same. This has been fixed.
      • CMYK images: implemented reading and writing of images in the CMYK colorspace.
      • imread options: in addition to Index and Frames, also implemented PixelRegion, which may speedup reading when only parts of the image are of interest, and Info, for compatibility only, which doesn’t have any effect at the moment.
      • imwrite options: in addition to WriteMode, implemented Alpha (as mentioned above).
      • indexed images: implemented writing of indexed images. Previously, indexed images would be converted to RGB, which would then be saved. An indexed image will now always return indexes, independently of whether a colormap was requested as output (previously, a raster image would be returned in such cases).
      • floating point and int32 images: GraphicsMagick works exclusively with integers, and we can't differentiate between an image saved with int32 or floating point values. Since the latter are far more common, Octave will now return a floating point image when GraphicsMagick reports a bitdepth of 32. Previously, images with a reported bitdepth of more than 16 would not be read at all (note that this is dependent on the GM build options: if GM's quantum depth was 16 or 8, the most common, GM would report a 32 bitdepth file with the quantum depth it was built with, so it would read files with a bitdepth of 32 as long as it was not built with quantum depth 32).
      • bitdepth: imread now returns the actual bitdepth of the image in the file rather than trying to optimize it. This is still not perfect but we’re making a better guess at it than before (this should be very simple but with GraphicsMagick it’s not).
      • imfinfo: some new fields, especially reading of Exif and GPS data, which allowed us to deprecate readexif in the image package.

      In the end, I recognize that there are some incompatibilities left to fix, but I don't know how anymore. As I mentioned in my previous post, GraphicsMagick is too smart for our use. I predict that we will eventually have to move to something different, if for nothing else, to read and write floating point images without trouble. I only had a quick glance at alternatives, but FreeImage would seem like a good bet. Of course, there's also the possibility of writing our own wrapper around the format specific libraries.

      It would be nice if users could throw some images at imread and imwrite and report any bugs. I'm especially afraid of regressions, since so much was changed and we don't really have tests for these functions yet.

      by carandraug at September 04, 2013 05:56 AM

      September 02, 2013

      Wendy Liu

      How not to check the validity of an email address

      Another update, after this post somehow hit #1 on Hacker News and garnered my website more visitors in one hour than it usually gets in a year: I got a lot of feedback on this post. While a lot of it was positive, there was also some valid criticism about the amount of flaming in this post. While at first my response was along the lines of "h8rs gunna h8 lol", after some thought I realised that while posts like this can be fun to write, they can also be counterproductive and even harmful to the developer community. I know that if someone had found some of my code and decided to publicly rip it apart, I would almost certainly feel hurt — even if it were deserved (like that time I wrote a custom CMS and made it vulnerable to SQL injection).

      I was young and foolish when I wrote this post. Admittedly, I'm less than a week older now, but hopefully much less foolish. Although I don't regret writing this post, I don't think I will be making posts of a similar nature in the future. While I do believe that bad practices in the developer community should be called out, there are undoubtedly ways of getting the same message across without resorting to vitriol or an overly snarky tone — even if the end result runs the risk of being less entertaining. Posts like this have a tendency to exacerbate a culture that is already mired in toxicity, and while recognising this fact cannot completely absolve me, I hope that it will at least deter others from behaving similarly.

      I considered taking this post down, but that seemed silly considering that 99% of people who will ever read this post have, in all likelihood, already read it. In addition, I am no longer afraid of being sued for defamation by the vendor, since they seem to have taken it in stride and — amazingly enough — have even expressed an interest in interviewing me for a developer position as a result of this blog post. If the original programmer ever reads this post, I apologise for the public ridicule at your expense. All I can say is that I hope you no longer write code like this. Though if you do, please send it to me and I will do my utmost to write a sincere, snark-free analysis of it. <3

      As a sidenote, one thing I found rather surprising is the fact that a significant number of readers seemed to interpret this post as an ode to my hatred of drugs. This would have been ideal if I ever intended to run for office, but since I don't, it's just perplexing; I would have thought the tone of the post made it obvious enough, but since it clearly wasn't, I will clarify that I merely took an over-the-top approach and ran with it, in the interest of writing an article that would be more interesting to read than "Here's some code I found". It is a fairly lazy literary technique, though, and probably not one I will be using again anytime soon. (Hey, I never claimed this post was a literary masterpiece. I promise it will get better, though.)

      Pretend, for a moment, that you're a newly hired programmer working on a popular learning management system called Hot4Learning. Your predecessor had been working on adding email functionality to the system, so that any user would be able to send an email to any other user at the school via a web interface. Unfortunately, your predecessor was recently hit by a bus and never managed to complete his magnum opus. Your task is to finish it by adding an email validation feature, to ensure that an email is sent only if the recipient is a valid email address associated with the school.

      For example: if Bob is a student at McGill University, then he should be able to send an email to any valid or email address. So if his friend Jane's school email is, then Bob should be able to send an email to that address. On the other hand, he should not be able to send an email to Nor should he be able to send an email to

      So your job is to write this feature, assuming you have access to a list of valid emails for the school.

      I hope you're thinking, "Please, that's easy. First, I'll create a string by joining the emails in the list with the string '|****|', then I'll pass that to the client by assigning the string to a variable within a <script> tag. Next, in the Javascript, I'll instantiate an Array object and assign it to a variable endearingly named temp, then throw away the aforementioned Array object and set temp to be the array created by splitting the string on '|****|'. Whenever a user enters an email, I'll convert it to lowercase and store it in a variable with the beguiling name curForwardUserName, and I'll create a variable called validUserName which I will set to be the string '0', because why use a boolean when you can use a string? Then, I'll loop through the entirety of temp; for each email address in temp, I'll convert that to lowercase, then check if it's equal to curForwardUserName. If so, I'll set validUserName to the string '1', and break. Finally, if validUserName equals the string '1', I'll consider curForwardUserName to be a valid email address; otherwise, it's invalid."

      Actually, no. That's a lie. I really hope that's not what you're thinking. I hope what you actually thought was along the lines of "Well, obviously I'd do the validation on the server side. I can store the valid emails in a data structure that allows O(1) membership testing, then check if the input email is contained in there." Because, well, that's the right way to do it.

      This is your brain

      In Python, the right way to do it would look something like this:

      if input_email in valid_emails_set:
          send_email(input_email, another_param, etc)

      This is your brain on drugs

      On the other hand, if you're working on a product called Hot4Learning, you probably hate your job and need to consume a staggering amount of drugs in order to keep coming to work. This is what your drug-addled brain might come up with:

      Warning: NSFWUYWOH (Not Safe For Work Unless You Work On Hot4Learning)

      // [some other code here]
      var userNamesStr = '|****| |****|
      (pretend there are 70,000 more emails here)|****|';
      var temp = new Array();
      temp = userNamesStr.split('|****|');
      var validUserName = '0';
      // [more code here]
      for( i =0; i< temp.length; i++){
      if( curForwardUserName == temp[i].toLowerCase()) {
      validUserName = '1';
      break;
      }
      }
      // [some last bits of wisdom before we go]

      UPDATE: Yes, the above code does involve sending all 70,000+ McGill email addresses to the client upon each page load. The source code of the page was over 2.5MB in size. As in, 2.5MB of pure text.

      Since your reviewer would likely be as stoned as you, you would get a quick LGTM and the above code would soon find its way into production.

      The actual story

      Last year, the IT department at McGill University replaced the old LMS (Blackboard) with a new system (which is not actually called Hot4Learning, but the actual moniker is no less trite or gimmicky so I'll spare you). Since the old system was riddled with vulnerabilities and suffered from general all-around shittiness, I was excited to see what the new system would be like. That excitement soon died, just like brain cells do when exposed to copious amounts of drugs.

      Upon clicking the "email" tab, I was greeted with 10 seconds of white screen as my browser tried to load the page. "Why on Earth is this taking so long?" I asked myself. "It can't be, like, trying to load every single valid email address at McGill, right? Lol. Haha. Hahaha. Hahahaha. Wait. ohgod"

      Words cannot express the horror I felt when I looked at the source of the page and found the code that I have reproduced above. The only time I've been more terrified was when I dreamt that I slept through my cryptography exam and then began to suffocate (but that turned out to just be the blanket).

      If you want to take a look at this code in the wild, well, you can't anymore. I reported it as a data leak vulnerability and the vendor actually fixed it within two weeks. However, the code as it appeared then is essentially what you see above, shitty whitespace and all; I just removed some code irrelevant to the functionality at hand and also removed the actual list of emails. Apologies to Zamboni Man if you exist and are a student at McGill.

      I would really like to know the combination and quantities of drugs consumed that resulted in this code. Do you know? Can you hook me up? Send me a message on Twitter @dellsystem.

      Legal addendum

      I hereby claim that the above snippet of code is fair dealing according to section 29 of the Copyright Act of Canada, which allows for usage for the purpose of research, private study, education, parody, satire, criticism or review and news reporting. In fact, this snippet was used for nearly all of these purposes:


      Research
      I stumbled upon this snippet while doing research for my B.S. thesis, "Not even once: Prolonged drug use and its effect on code quality". It will be the perfect B.S. thesis, complete with B.S. graphs and footnotes in which I beg the reader to take me on as a grad student.

      Private study

      I saved the source code of the page because I wanted to privately study the correlation between prolonged drug use and code quality.


      Education
      I would like to educate the public on the deleterious effects of prolonged drug use on code quality.


      Parody
      I wish I could say that the code above is a parody of the actual source code I encountered. I really do. But I can't.


      Satire
      The code above was used as part of a scathing satirical indictment of the modern programmer's tragic penchant for drugs. Also, society.

      Criticism or review

      3/10 would not bang

      News reporting

      BREAKING: Prolonged drug usage can result in terrible code. CNN had better link back to me when they pick up this story, I could use the ad revenue.

      by dellsystem (Wendy Liu) at September 02, 2013 04:00 AM

      July 30, 2013

      Carnë Draug

      GSoC 2013: GraphicsMagick (update #4)

      Octave uses GraphicsMagick (GM) for reading and writing of images, which gives us a simple, unified interface to read many different image formats. We actually get support for more image formats than Matlab.

      However, I’m starting to think that GM may not be the best option, or even an option at all, if we wish to be Matlab compatible. The reason is that GM does not give us methods to access information about the actual image that is on file, only about how the image is being stored. I’ll be listing the problems I have faced during the last few weeks while improving image IO in Octave.

      Note that I’m not arguing that GM is bad. We can use it to read pretty much any image; the problem is that what we read may be different from what was in the file. Octave needs to get the original image values, not one of the possible representations of that image. GM is doing a great job at being smart, but we may need something more stupid.

      Quantum depth

      It is well known among those that use Octave for reading images that we’re dependent on the quantum depth used to build GM. The way it works is that GM can be built with quantum depth 8, 16, or 32, which defines the type it uses to store the image values (uint8, uint16, and uint32 respectively). Independently of the original image values, GM will scale the values to the range of the type. If the file has 8 bit depth, and GM was built with quantum 16, we will need to fix the range (this seems to be one of the few cases where we can get the actual image information, using the depth () method). As long as the images have their values as unsigned integers there is no problem, but images are not limited to those types. Images with values as floating point become a problem (see bug #39249) since we lose all information about the range. Also, it appears that at least the JPEG 2000 format may use int16 instead of uint16, and I have no idea what to do about that.
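      As a hedged sketch of the range fix described above (assuming GM scales exact 8-bit data into the uint16 range by the usual factor 257 = 65535/255; the helper name is mine, not Octave's):

```python
def unscale_to_8bit(value_q16):
    # a quantum-16 GM build stores an 8-bit file value v as v * 257,
    # so integer division by 257 recovers the original file value
    return value_q16 // 257
```

      The same idea applies, with a different factor, when a quantum-32 build reads 8 or 16 bit files.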

      Image class

      GM defines two Image Class types: direct and pseudo class. Direct class is for a “normal” image, where the image is “composed of pixels which represent literal color values“. Pseudo class is for indexed images, where the image is “composed of pixels which specify an index in a color palette“, or colormap in Octave vernacular. The problem is that the class is not that of the original image but based on what GM guesses will increase performance. This leads to some JPEG files (a format that has no support for indexed images at all) being reported as PseudoClass, probably because they have few unique colors. And we can’t make the decision based on the image format, since some (such as TIFF and PNG) support both. If an image from such a format reports being pseudo class, we have no method to find out whether GM is choosing that for performance or if it’s the original file class.

      Image type

      Similarly to the image class, GM also has image types. There are many more image types, which define the image of any class as bilevel, grayscale, RGB, or CMYK, with or without opacity (called matte in GM). Just like it does for the image class, GM guesses which type is better, so a file with an RGB image that only uses one of the color channels will report being grayscale, or even bilevel if it only has two values on that channel (see bug #32986 and bug #36820).

      For now, there’s not much we can do. Adding methods to find information about the actual image in the file through GM is not trivial and outside the scope of my project. I have already implemented the options I proposed, the ones that allow reading and writing of multi-dimensional images.

      Fixing the problems of indexed images and alpha channels was not part of the proposal, but it made sense to do it as I was going through the code. We are now Matlab compatible, but in some cases there is nothing we can do; it’s just the way GM works.

      If we wish to be 100% Matlab compatible when reading images, some other strategy is needed. Whether that means improving GM or replacing it with other libraries, I don’t know. What we need is a C++ library that reads and writes images in any format and gives us access to the original image values.

      by carandraug at July 30, 2013 03:11 PM

      July 25, 2013

      Carnë Draug

      GSoC 2013: rewriting imerode and imdilate (update #3)

      I pride myself in writing heavily commented code. That’s not necessarily a good thing; maybe it shows that I need an explanation for things that others consider too trivial to comment, parts where the code comments itself.

      Still, sometimes I take a specific approach to a problem, find a problem with it, and change it. At the start I thought it to be the best solution and was unable to foresee the problem. Someone coming later may have the same initial idea as me, and attempt to implement it just to find the same problem. Provided the problem is found early enough, such things are never committed, so one can’t simply look into the logs. That’s why I have a fair amount of comments on why code was not written in some other way.

      The following is a comment block that I wrote for the imdilate function (currently an m file) that will never be committed, since I ended up rewriting the whole thing in C++. But at least I can reuse it for the blog post, since it explains a long standing problem with the morphology functions of the image package. Oh! And the comment was so long that I actually thought it deserved its own title in the source code.

          The problem of even sized and non-symmetric Structuring Elements
      Using convn() to perform dilation and erosion of binary images is
      a *lot* faster than using __spatial_filtering__(). While that is
      probably an indication that __spatial_filtering__ could be improved,
      we still use convn() for both simplicity and performance.
      Erosion is simpler to understand because it's a simple spatial filter,
      not unlike stamping on the image. Now, dilation is the erosion of the
      image complement. This explains how for grayscale images, erosion and
      dilation are minimum and maximum filters respectively. The concept is
      quite simple as long as the SE is symmetric. If not, then there's the
      small detail of having to translate it, i.e., rotate around its center.
      But convn() already performs the translation so we don't do it for
      dilation (the one that would need it), but do it for erosion (because
      convn() will undo it by retranslating it).
      A second problem is even sized SEs. When one of the sides has an even
      length, then we have more than one possible center. For Matlab
      compatibility, the center coordinates are defined as
            floor ([size(SE)/2] + 1)
      That's also what convn() uses, which is good, but since we also need to
      do translation of the SE, small bugs creep in which are hard to track.
      The solution for non-symmetric SEs is simple; the problem arises with
      the even sizes. So we expand the SE with zeros (no effect on the filter)
      and cut the extra region in the output image.
      Note: we could use cross correlation since it's like convolution without
      translation of the SE but: 1) there is no xcorrn(), only xcorr() and
      xcorr2(); 2) even xcorr2() is just a wrapper around conv2() with a
      translated SE.
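      The center rule quoted in the comment above can be checked with a tiny sketch (1-based coordinates as in Octave; the helper name is mine):

```python
def se_center(shape):
    # Matlab-compatible center of a structuring element:
    # floor(size(SE)/2 + 1), computed per dimension (1-based indices)
    return tuple(s // 2 + 1 for s in shape)
```

      For odd sizes this picks the true middle; for even sizes there is more than one candidate and this rule picks the lower-right one, which is why the comment above pads the SE with zeros and trims the output.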

      After spending an entire day playing with the translation of SEs and the padding of images, I decided to rewrite imdilate and imerode. Once you understand what these are actually doing, you’ll see why.

      Consider erosion of a logical image as the simplest example. This is basically sliding the SE over the image and checking the values of the image under the true elements of the SE. If any of them is false, then the value at those coordinates in the output matrix will be false. There’s no need to even check the value of all elements under the SE; we can move to the next position as soon as we find one that does not match.

      Convolution, however, needs to multiply the values of the SE with the values of the image, and then sum them all together to get the value for the output matrix. And all of that in double precision. And still, it was performing faster. Just take a look at the following:

      octave> im = rand (1000) > 0.3;
      octave> se = logical ([0 1 0; 1 1 1; 0 1 0]);
      octave> t = cputime (); a = convn (im, se, "valid") == nnz (se); cputime() - t
      ans =  0.12001
      octave> t = cputime (); c = __spatial_filtering__ (im, se, "min", zeros (3), 0); cputime() - t
      ans =  2.3641

      There are a couple of reasons why __spatial_filtering__ is slower, but I think the main reason is that it was written to be a more general tool. It has a lot of conditions that make it quite useful for all the other functions, while working nicely with images with any number of dimensions. But it’s 20 times slower, performing something that should be a lot simpler. Since dilation and erosion are at the core of most morphology operators, they are among the ones that should be optimized the most.

      So I wrote my own code for dilation in C++ and managed to make it a lot faster than __spatial_filtering__. Since my implementation does not actually evaluate all of the values, its performance depends on the distribution of true elements in the input matrix. And while it’s a lot faster than __spatial_filtering__, it’s still not faster than convn.

      octave> im1 = rand (1000) > 0.3;
      octave> im2 = rand (1000) > 0.95;
      octave> t = cputime (); a = convn (im1, se, "valid") == nnz (se); cputime() - t
      ans =  0.11601
      octave> t = cputime (); b = imerode (im1, se, "valid"); cputime() -t
      ans =  0.21601
      octave> t = cputime (); c = __spatial_filtering__ (im1, se, "min", zeros (3), 0); cputime() -t
      ans =  2.3681
      octave> t = cputime (); a = convn (im2, se, "valid") == nnz (se); cputime() - t
      ans =  0.12001
      octave> t = cputime (); b = imerode (im2, se, "valid"); cputime() -t
      ans =  0.16001
      octave> t = cputime (); c = __spatial_filtering__ (im2, se, "min", zeros (3), 0); cputime() -t
      ans =  2.3681

      As can be seen, even under optimal conditions (when most of the input matrix is false), my implementation still performs slower than convn. But I’m not done with it yet.

      by carandraug at July 25, 2013 12:50 AM

      July 11, 2013

      Carnë Draug

      GSoC 2013: imformats and filenames (update #2)

      Since my last post, I’ve been working on the image IO abilities of Octave by implementing the missing function imformats(). The purpose of this function is to control imread(), imwrite(), and imfinfo(), by adding support for new image formats and/or changing how existing ones are handled.

      By default, Octave’s image IO functions (imread(), imwrite(), and imfinfo()) use the GraphicsMagick library for their operations. Through GraphicsMagick, Octave can support a vast number of image formats, a much larger number than the ones required for Matlab compatibility. However, because of the large amount of image formats in science and their commonly closed nature, this is still not enough to support all image formats out of the box.

      Enter imformats()

      imformats() keeps an internal list of all the supported image formats and of the functions that should be used to deal with them. What this means is that when a user calls, for example, imread(), that function only needs to ask imformats() what function should be used to read the file, and then pass all the arguments to it.
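      The dispatch idea can be sketched as follows (Python for brevity; the registry layout and all the names are illustrative, not Octave's internals):

```python
# A registry mapping a file extension to its handler functions. The
# imread() sketch below only looks up the right handler and forwards
# its arguments to it, which is the role imformats() plays in Octave.
_formats = {}

def register_format(ext, read):
    _formats[ext] = {"read": read}

def imread(filename, *args):
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in _formats:
        raise ValueError("unknown image format: " + ext)
    return _formats[ext]["read"](filename, *args)

# a stub "reader" stands in for the real format-specific function
register_format("tif", lambda f, *a: ("tif", f, a))
```

      Adding a new format is then just another register_format() call; the readers themselves never need to know how they were found.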

      In my experience (mainly microscopy), the majority of the closed image formats in science are built around TIFF. For example, an lsm file is just a TIFF file. By default, Octave will recognize it as TIFF and try to read it as such. However, it will fail because only half of the images inside the file are the actual images that are meant to be read. The other half are thumbnails of them and should be skipped. In another example, dv files are also just TIFF files, but for some weird reason the images are inverted. Here’s how to read such files:

      if (strcmpi (extension, ".lsm"))
        image = imread (file, 1:2:nPages*2);
      elseif (strcmpi (extension, ".dv"))
        image = imread (file, 1:nPages);
        image = rotdim (image, 2, [1 2]);
      else
        ## let imread decide
        image = imread (file, 1:nPages);
      endif

      Note that the actual code is a bit more complicated than what I’m showing, in order to account for old versions of the formats which can’t be deduced by the file extension alone.

      As can be seen, most changes required to make Octave able to correctly read other formats are small hacks around imread(). But we can’t simply add a new format to imformats() instructing imread() to use a function that itself calls imread(), or we would get stuck in an endless loop. Since the actual function performing the action is now separated, one must use the function handle returned by imformats(). imread() can now be configured to read the above formats with the following code:

      function image = read_lsm (filename, varargin)
        tif = imformats ("tif");
        ## piece of code to figure nPages from varargin
        image = (filename, 1:2:nPages*2);
      endfunction

      function image = read_dv (filename, varargin)
        tif = imformats ("tif");
        ## piece of code to figure nPages from varargin
        image = (filename, 1:nPages);
        image = rotdim (image, 2, [1 2]);
      endfunction

      lsm = imformats ("tif");
      dv = imformats ("tif");
    = @read_lsm;
      imformats ("add", "lsm", lsm);
    = @read_dv;
      imformats ("add", "dv", dv);

      This is something not meant to be done in the main code. It’s something that can be added to an .octaverc file, or even better, to the PKG_ADD scripts of an Octave package meant to add support for other image formats.

      The end result is consistent Octave code that abstracts itself from file formats.

      Filename and Mathworks bashing

      There’s a tiny detail that complicated things a bit. My understanding of a filename is that it includes the file extension. However, someone at Mathworks seems to think that is open to discussion and made it possible to define the extension as a separate argument. As if that were not weird enough, the extension is completely ignored if a file is found before trying to append the extension. Consider the following situation where you have two images, with and without extension.

      >> ls
      >> imread ("test_img");          # reads test_img
      >> imread ("test_img.tif");      # reads test_img.tif
      >> imread ("test_img2", "jpg");  # reads test_img2.jpg
      >> imread ("test_img", "tif");   # reads test_img! ????

      So yeah! Why would we pass the extension as a separate argument just to have it ignored? I may be wrong about the why of this API but my guess is that it’s required to support more arguments after the extension:

      >> imread ("test_img.tif", "Index", 1:10, "PixelRegion", {[250 350] [100 200]})

If imread() can find a file without the extension, then whatever comes next must be the reading options, and not the extension. But then, why does imwrite() do the same? It certainly can’t make that decision based on whether the file already exists. So we have to check if we have an odd or even number of leftover arguments.

And why allow not specifying the file extension in the first place? Again, just my guess: to cover the stupidity of people who don’t know about file extensions because their file managers hide them.

Interestingly, the question of whether the file extension belongs to the filename is only relevant with image files. Other functions such as audioread(), csvread(), fopen(), or wavread() do not suffer from this.
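For the record, the lookup order described above can be sketched in a few lines. This is a hypothetical helper (the name find_image_file and its behaviour are my own reconstruction, not the actual imread() source): the bare filename wins, and the separate extension argument is only tried when the bare name does not exist.

```octave
## Hypothetical sketch of imread()'s filename resolution: a file found
## under the bare name makes the extension argument irrelevant.
function fname = find_image_file (filename, ext = "")
  if (exist (filename, "file"))
    fname = filename;            # found as-is: the extension is ignored
  elseif (! isempty (ext) && exist ([filename "." ext], "file"))
    fname = [filename "." ext];  # only now is the extension appended
  else
    error ("find_image_file: unable to find %s", filename);
  endif
endfunction
```

With both test_img and test_img.tif on disk, find_image_file ("test_img", "tif") returns test_img, reproducing the surprising behaviour shown above.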

      For more Matlab bashing, go read the abandon matlab blog.

      by carandraug at July 11, 2013 04:01 PM

      June 05, 2013

      Carnë Draug

      GSoC 2013: image processing of ND images (update #1)

      My project proposal for image processing of ND images in GNU Octave was accepted for Google Summer of Code 2013. If all goes as planned, not only will it be a very nice improvement to the image package but also a chapter on my PhD thesis.

I decided to warm up by implementing montage() which basically displays all pages (or frames) of an ND image as a single 2D image. It is not part of the proposal but the coding period hasn’t started anyway, and it will be really useful for the rest of the project. So useful indeed that I had written my own version of it last year, completely unaware of Matlab’s implementation and, of course, completely incompatible. So I had to do it all over again.

Following the trend of changing 20 000 other small things first, my first contribution ended up being a fix for ind2rgb() that drew a complaint within just two hours. Never had I made a commit with such immediate repercussions. Still, I’m taking it positively because:

      1. means I’m changing code that people use and care about;
      2. my fix was actually correct :)

      I guess the main problem users will have is understanding ND images. While images can have many dimensions, the standard is to have them as a 4D matrix, the multiple images concatenated into the 4th dimension, and the 3rd being reserved for the colorspace. Expect a new section for the Octave manual.
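The convention described above can be shown in a couple of lines: pages are stacked along the 4th dimension, with the 3rd reserved for the colourspace.

```octave
## Build a small "multipage" image following the 4D convention:
## rows x columns x colours x pages.
page1 = rand (64, 64, 3);       # one RGB image
page2 = rand (64, 64, 3);       # another page of the same size
img   = cat (4, page1, page2);  # concatenate pages along dimension 4
size (img)                      # -> 64 64 3 2
```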

To implement montage(), the most difficult part was dealing with a cell array of filepaths for images. Handling all the different possibilities wasn’t trivial, trying to think of all possible combinations to create a correct image with all of them. In particular, multipage indexed images are troublesome because I have never seen one. And because imread() seems to always return a single colormap, a possible Octave bug may exist when reading a multipage indexed image where each page has a different colormap. From reading the TIFF 6.0 file specification, such a file could in theory exist. However, I don’t think they actually do.

      In the end, montage() is finished and added to the image package (r11933 and r11934). In addition to Matlab’s API, I included options to add margins between panels, and configure the colour for both margins and panel background. It’s far from being the smartest piece of code I ever wrote, but smart code would use a lot of permute(), reshape(), and indexing trickery, turning it into a maintenance problem.

      It took me 4 times longer than originally planned. I’m blaming the Irish weather.

      Unreal Irish weather.

Ireland, famous for its non-stop rain, has had amazingly sunny weather for 2 weeks now. Galway has been the world’s procrastination capital as everyone’s out getting as much sun as possible.

      by carandraug at June 05, 2013 06:26 PM

      January 08, 2013

      Octave Forge

      image package: strel class, non-flat morphology and spatial transformations

      There's been a lot of development on the image package lately, spurred by contributions from Pantxo Diribarne and Roberto Metere. While these have not been tested enough for a release, they're on a state where we could welcome some testing. If you can't get development version from subversion, please give this tarball a try (prepared from revision 11551).

      strel class

The strel class was something that I wanted to implement a long time ago but had never found the time. The name strel comes from structuring element (SE), the shapes used in morphological operations such as dilation and erosion. I had only seen it as a standard way to create SEs, but it is actually much more. Especially, SE decomposition can bring a really nice increase in performance.

      Roberto Metere submitted his own implementation of the class last month and we have been working on it, slowly adding it to the other functions of the package. It started as a single .m function, no OOP at all, but he managed to implement @strel with the old @class style while keeping matlab compatibility. All the basic methods have been implemented: the object constructor, getnhood, getheight and reflect.
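A minimal sketch of the basic methods mentioned above, assuming the development version of the image package described in this post is installed:

```octave
## The basic @strel methods: constructor, getnhood and reflect.
pkg load image
se    = strel ("square", 3);  # 3x3 flat structuring element
nhood = getnhood (se);        # logical matrix defining the neighbourhood
se2   = reflect (se);         # the SE reflected through its centre
```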

The idea behind SE decomposition is that the cost of a morphology operation grows with the number of pixels in a SE. The bigger the SE, the slower it will be. However, some SEs can be replaced by a sequence of smaller ones, so it’s in our best interest to use them. For an example of the performance difference, see how dilation by a square compares to dilation by 1 row and 1 column of the same size:

      octave> im1 = im2 = randp (5, 2000) > 15;
      octave> t = cputime ();
      octave> im1 = conv2 (im1, true (20), "same") > 0; # dilation by 1 square
      octave> cputime () - t
      ans =  2.6402
      octave> t = cputime ();
      octave> im2 = conv2 (im2, true (20, 1), "same") > 0; # dilation by 1 column
      octave> im2 = conv2 (im2, true (1, 20), "same") > 0; # dilation by 1 row
      octave> cputime () - t
      ans =  0.52803
      octave> isequal (im1, im2)
      ans =  1

At the moment, decomposition is only done for the rectangle, square and cube shapes, but others will come with time. Functions that can gain from SE decomposition are written so that it does not matter whether it has been implemented for a specific shape. This means that when it is implemented for another shape, the effect will be immediate across the whole image package. The only file where this is done is inst/@strel/getsequence.m, so send us patches.
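A sketch of how the decomposition is exposed, again assuming the development image package: getsequence() returns the smaller SEs whose successive application is equivalent to the original one.

```octave
## Decompose a square SE into smaller pieces via getsequence().
pkg load image
se  = strel ("square", 20);
seq = getsequence (se);   # array of smaller strel objects, e.g. a
                          # 20x1 column and a 1x20 row for a square
```

Dilating successively by each element of seq should give the same result as dilating once by se, only faster, which is exactly the conv2 experiment above.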

Going through imdilate() and imerode() to make them use strel brought up a bunch of other matlab incompatibilities that I hope are now fixed, as well as other improvements. I’m a bit interested in morphology of volumes, so one of the changes made was making them work for N-dimensional images (think MRI scans).

      On top of the matlab shapes for strel, I implemented the cube as an optional shape. I also wanted to implement ball as a volume but unfortunately, matlab has already done it wrong as a non-flat ellipsoid. Note that non-flat is unrelated to volumes.

      Non-flat morphology

This has confused me for a very long time. Because of the name (non-flat), and because 3D images are the norm for me, I had always assumed that a non-flat SE was one used for volumes. The fact that the only non-flat standard shape in matlab is named ball, which immediately brings up the idea of a volume, also added to the confusion.

Actually, non-flat morphology is something that only makes sense for grayscale operations. A non-flat SE is defined by two matrices: one defining the neighbourhood (same as a flat SE) and another defining the height of each neighbour. These heights are added to the image pixel values before the erosion or dilation, in the same way as the variable S in ordfiltn.

      Basically, not useful for volumes at all but the image package can do this now. To create a non-flat SE, use the arbitrary shape of strel.
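Concretely, with the development image package a non-flat SE is created by passing a height matrix along with the arbitrary neighbourhood (the height values here are just an illustration):

```octave
## A non-flat SE: neighbourhood plus per-neighbour heights that get
## added to the pixel values during grayscale dilation/erosion.
pkg load image
nhood  = true (3);
height = [0 1 0; 1 2 1; 0 1 0];  # example heights, not from the post
se     = strel ("arbitrary", nhood, height);
```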

      Spatial transformations

      Pantxo Diribarne has also submitted a set of functions for spatial transformations: maketform, cp2tform, tformfwd, and tforminv. These are still not completely implemented and generally restricted to 2D transforms. Adding the missing options should now be much easier.

by Carnë Draug at January 08, 2013 06:45 AM

      January 04, 2013

      Wendy Liu

      Simulating the night sky with CSS

      Have you ever looked up at the sky on a clear, moonless night? When all you can see is an endless field of stars, glittering like an array of celestial diamonds speckled across a cosmic fabric? Did it make you think, "I wonder if I could simulate this in the browser using transparent images and CSS"?

      That has never happened to me either, actually. But it was the best segue into this subject that I could think of.

      Recently, I built the frontend for a visualisation of Christmas wishes on Twitter. My biggest consideration when thinking of a design was that I wanted to make it somewhat Christmas-themed while avoiding kitschiness as much as possible. I settled on a night sky backdrop, with falling snowflakes simulated using a modified version of this jQuery plugin (okay, a little kitschy, but I'd like to think it embodies a certain classiness as well). The backdrop ended up looking like this:

It looks simple enough, but the technique is actually rather involved. Why? Because the sky is meant to expand to fill the user's browser. Creating a fixed-size image containing randomly-positioned stars would not suffice, because then that image would either have to be tiled (which would be very noticeable, ruining the illusion of the endless starfield) or stretched (which could result in very large and pixellated stars for users with larger screens). Neither solution is desirable.

      So what's the alternative, if you want to preserve the illusion that these stars go on forever without having to actually generate an endless starfield? My solution was to separate the stars into three different layers: small stars, medium stars, and large stars. Each layer would be a mostly-transparent PNG image, with white dots for stars. And most importantly, the widths and heights of each layer would have to be relatively prime.

      In my original design (the one pictured above uses images that are half the size of the original ones, for illustration purposes), the dimensions were:

      • Small stars layer: 1103 wide × 541 high
      • Medium stars layer: 1009 wide × 479 high
• Large stars layer: 1051 wide × 524 high

How many pixels of "randomness" do we get before it repeats? Well, all the numbers except 524 are prime, and we can see immediately that 541, 479 and 524 are relatively prime. So the non-repeating area has a width of 1103 × 1009 × 1051 = 1,169,686,277 pixels (the lcm of the individual widths), and a height of 541 × 479 × 524 = 135,788,836. The result is more than a hundred million pixels of vertical scroll space and more than a billion pixels of horizontal scroll space before we encounter a pattern of stars that we've seen before. Not bad for less than 14KB of images. (In fact, these numbers are a lot higher than they need to be - anything above 300 or 400 pixels would probably work just as well, with the period being well over a million pixels.)
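The arithmetic is easy to check: the combined pattern repeats at the least common multiple of the layer dimensions, which for pairwise relatively prime numbers is simply their product.

```octave
## Repeat period of the three star layers: the lcm of their dimensions.
lcm (lcm (1103, 1009), 1051)  # width before repeating  -> 1169686277
lcm (lcm (541, 479), 524)     # height before repeating -> 135788836
```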

      So that's the basic idea. DesignFestival calls this the cicada principle, which is a catchy if somewhat silly name for something that is a basic property of numbers (there are some really cool examples here). If you want to use this concept in your own project, the HTML structure looks like this:

      <div id="stars">
          <div class="large"></div>
          <div class="medium"></div>
          <div class="small"></div>
      </div>
      Feel free to grab the CSS and the images (stars_large.png, stars_medium.png, stars_small.png), though making your own images is pretty fun as well (I spent a few minutes pasting white dots at random positions in an image document in Inkscape; it was a blast). If you do use this in your own project, I would love to hear about it.

      by dellsystem (Wendy Liu) at January 04, 2013 05:00 AM

      November 06, 2012

      Octave Forge

      minutes from OctConf2012: pkg - package system and structure

      There's an Octave code sprint planned for the weekend of 17-18 of November with the purpose of improving the functionality of pkg(). Some of the improvements were discussed at OctConf2012 but more have since been discussed on the mailing list. At the time, I started writing a long report (in the style of a meeting minutes) about pkg() and Agora, the things that had a bigger impact for Octave Forge but only finished the part of pkg(). The comments I received were that the text was too long and detailed so I ended up writing a shorter text that covered both items.

But with the code sprint coming up soon, we do need a proper document stating what pkg() should do. When we looked at it during OctConf, things were intertwined in such a way that any change required fixes everywhere else. My guess is that if a bunch of people start coding away on it at the same time, even if on different problems, we will keep stepping on each other's toes. And if we do create 1 branch for each sprinter, merging them back together might not be so easy. The ideal would be to have something like Python's PEPs.

      In the mean time, I'll post here the minutes of the pkg() discussion during OctConf2012.

These were first discussed between Carnë Draug (CD), Juan Carbajal (JC) and Carlo de Falco (CF) before being presented to the community present at OctConf2012 on the morning of July 19 for further discussion. During the rest of the event CD, JC and CF continued discussing the plans, whose conclusions are now reported.

It was the general opinion that the current problems with the pkg system are caused by the code complexity of pkg(), itself a consequence of its slow development path, with new features added one at a time on top of the previous ones, as they were needed. Also, the nature of the problem, mostly string processing and handling of directory contents, does not lend itself to clean code using Octave's functions. As such, a list of problems with the current system and of new desired features was made, to have a clear design of what the system should support.

It was proposed by CD to rewrite pkg() in Perl. Despite the language's fame for being hard to read, it would allow for shorter and faster code. It would be much easier to maintain by someone familiar with Perl than the current code is for Octave programmers. Even for someone unfamiliar with Perl, it should take less effort to fix bugs. Plus, Perl is already an Octave dependency, so Octave users will already have it installed on their systems. CF pointed out that it is just a build dependency and therefore not necessarily present on the user's system. While it is true that pretty much all Linux distributions require Perl, the same does not hold for Windows. pkg() is currently faulty on Windows, so that wouldn't be a problem, but the hope is to make it work there too. The idea of using Perl was then rejected.

CD, CF and JC were of the opinion that the autoload option is not a good one and that pkg() should not support it. Packages can shadow core Octave functions, and even other packages' functions; in the latter case, no warning is given. Code meant to run with specific packages, or even with no packages at all, may catch others by surprise. Also, some users are not aware that some functions they use come from packages. Forcing them to load packages as needed will make them aware of what they are doing. No other programming language has packages, modules or libraries loaded by default (with the exception of special cases such as some Python implementations). JC gave the example of a practical class where the teacher gives commands to students in front of their pre-installed Octave systems: the first command they should run should be pkg load, and the professor should not have installed the package with autoloading by default. Any user would still be free to configure his .octaverc file to load packages at startup; configuring the startup of Octave sessions is the objective of .octaverc, not of a package system. CD pointed out that loading of packages is also not completely safe at the moment. When loading a package, its dependencies are loaded at the same time. However, these dependencies can be unloaded, leaving the packages dependent on them loaded, without issuing a warning. The discussed options were: unload all dependent packages at the same time, refuse to unload the package, or keep the current behaviour. The verbosity level for attempting to unload such a package was also discussed, but no conclusion was reached.
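For users who do want packages available at startup, the opt-in route mentioned above is a couple of lines in ~/.octaverc (the package names here are just examples):

```octave
## In ~/.octaverc: load packages explicitly at the start of each session,
## instead of relying on pkg autoload.
pkg load image
pkg load signal
```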

A frequently requested option is to automatically identify, download and install all of a package's dependencies. CD, CF and JC all agreed that this should be implemented. It shouldn't be too much work, since the dependencies are already being identified and can be downloaded with the -forge flag; all the code is already in place, and it should only require a few extra lines. This is obviously only possible for dependencies on other packages. A check for the availability of external libraries and software can be performed with Makefiles, but pkg() can't solve those.

CF suggested adding two more variants to the install option: one to install a package given a URL, and another to install the development version of a package. As with the option to automatically solve dependencies, and for the same reasons, the URL variant should be easy to implement. CD argued that the dev option should not be implemented because it would stop packages from being released, as users become more used to it and start installing possibly broken packages. CF said it would still be very useful for package maintainers preparing a new release. JC suggested using releasePKG() from the admin section, which already does it; it requires a local clone of the repository, which should already be available if it is for a developer preparing and testing a new package release. It was agreed that the url option, but not the dev option, would be added to pkg().

CF and JC were of the opinion that the package system should not support both local and global installs, and that all installations performed by pkg() should be local. CF reported that on Mac systems global installs were made local even when running octave with sudo. CD mentioned that on Windows systems the opposite happens, and all installs are global (this being checked with ispc() in the code). The two types of installation are thus exclusive to Linux systems. CF and JC said that global installs should be left to the distribution maintainers and pkg() should deal with local installs only. CD argued that this would mean that system administrators, even the ones compiling the latest octave version, would depend on their distro maintainers for the latest version of packages. CF and JC replied that supporting both types complicates the system and that packages are more user specific. It was agreed that the option would then be removed. After discussing this option with Jordi Hermoso, it was discovered that at least the Debian maintainers actually use pkg() to prepare the packages. It was then decided that pkg() would keep dealing with both installation types.

CD, CF and JC were all of the opinion that the -local and -global install flags were mostly useless and should be removed, since the type of installation was already decided based on whether the user was root, these flags being only useful to force the other type. CD proposed changing the default to a global installation whenever there are write permissions, rather than requiring root, so as to permit a dedicated octave system user to make global installs. This also allows a local installation of octave (a regular user compiling and installing octave in their home directory, for example) to make a global package install: global relative to the local octave installation, with the packages in the octave tree rather than in a separate directory. This should allow for a cleaner organization. These two changes were made and committed before the end of OctConf2012.

The current list of installed packages, local and global, is a simple struct saved in the -text format. CD was of the opinion that this should be made a binary format, to discourage users from manually editing the file and accidentally breaking the system. CF argued the opposite, that such editing may be necessary. It was decided to simply leave a warning in the file header.

CD noticed that it is not possible to use packages on a system that has more than one octave version installed. While .m functions will work normally, .oct files compiled at installation time are version specific and will therefore fail. These are placed in API-specific directories to avoid their incorrect loading, but reinstalling the package removes them, forcing a reinstallation of the package every time a different octave version is to be used. CF also pointed out that a system to perform reinstalls should be made, and the package sources kept, so as to reinstall packages with new octave versions. CD noted that this would also allow running the %!tests of C++ functions after install. Similarly, it was noted that it is currently not possible to have more than one version of the same package installed.

      List of conclusions:
      • dependencies on other packages should be automatically solved
      • pkg() will not load packages automatically
      • an option to install packages given a URL will be added
• the source of installed packages will be kept on disk for future reinstallations
      • it will be possible to have multiple package lists that can be merged or replaced
      • support for different packages version and different octave versions will be added
      • pkg() will stay written in the Octave language
      • the -local and -global options will be removed
      • a header will be added to the octave list files warning that they should not be manually edited

by Carnë Draug at November 06, 2012 04:12 PM

      November 05, 2012

      Max Brister

      JIT, Debugging, and Interrupts

      I finally found some time to work on Octave last weekend. There has been some talk on the mailing list recently about releasing a new version of Octave, so I figured I should clean up a few loose ends in JIT.


      Up until now JIT has been skipping breakpoints. While there are several issues with supporting breakpoints in JIT, the biggest one is that the Octave debugger allows for the execution of arbitrary statements. Arbitrary statement execution is a very powerful and useful feature of the Octave debugger, but it allows users to change the type of a variable in the middle of code execution.

      JIT gets its performance improvement by making assumptions about variable types, so entering debug mode means we need to exit JIT. I took the simple way out here and do not start JIT if there are any breakpoints in the code (see hg).


In Octave if you hit control-c Octave will stop execution and return to the Octave prompt (unless debug on interrupt is set). The interpreter does this by calling octave_quit, which checks the interrupt state and throws an octave_interrupt_exception if an interrupt has occurred.

      Ideally, to support interrupts in JIT a call to octave_quit should be inserted once per loop iteration. Unfortunately, it is not that simple. After an interrupt occurs the current variable values need to be reported to the interpreter. For example,
      i = 0;
      while 1
        i += 1;
      endwhile
      If the user interrupts the loop the interpreter needs to save the current value of i. This means JIT needs a way to catch and rethrow the octave_interrupt_exception. While LLVM does have a way of handling exceptions, the infrastructure in Octave does not yet exist to support the LLVM feature.

      Instead, I inserted a check to octave_interrupt_state. If octave_interrupt_state is greater than 0, we need to exit to the Octave prompt. I reused the code for checking error_state to achieve this.

      Now that JIT handles interrupts and breakpoints in a manner consistent with the interpreter, I can't think of ways in which JIT and the interpreter differ (besides speed). The amount of code which JIT can compile is still fairly limited. Hopefully, I will get some time over winter break to make it easier to extend JIT and improve what JIT can compile. In its current state JIT should be ready to include as an experimental feature in the next Octave release.

by Max Brister at November 05, 2012 12:10 AM

      October 27, 2012

      Octave Forge

      less holidays to code for Octave

Octave Forge has many unmaintained packages. Way more than the official list suggests. Actually, most of the unmaintained packages are listed as maintained, since no one has even bothered to update their status.

Still, unmaintained packages receive the occasional bug report, sometimes with a patch. The latest was for holidays() from the financial package. These are the best: they mean that not only is someone using the code, but also that they care enough to try and fix it.

On the opposite side of the spectrum there are things such as this commit (I'm actually a bit ashamed of it). It introduced a huge bug that almost anyone using xcorr2() should have noticed immediately. But no one did. I mean, it was not producing a subtly incorrect result or anything difficult to notice; it was giving a very visible "error: invalid type of scale none". It made xcorr2() almost useless. But it was released with signal-1.1.2 (2012-01-18) and fixed only 8 months later, without anyone ever complaining.

      Anyway, back to the bug in holidays(). I applied the second patch from the reporter. Even though I don't care about this function at all and the patch did fix this problem, I spent some time looking at the code. Not that it was complicated, quite the opposite, but I had never dealt with dates in Octave before. And I learned something new.

This function, kind of, returns the dates of holidays that close the NYSE (New York Stock Exchange). What I did not know was that when a holiday falls on a Saturday or Sunday it is shifted to Friday or Monday respectively. And I definitely did not know the exception to this rule: when the shift would move the holiday to another year, there's no holiday at all (the only case of this is when New Year's Day falls on a Saturday). And that was exactly the origin of the problem.
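The rule is easier to see in code. This is a hypothetical helper sketching the logic just described, not the actual holidays() source: Saturday holidays move to the preceding Friday, Sunday holidays to the following Monday, and a shift that would land in the previous year drops the holiday entirely.

```octave
## Hypothetical sketch of the NYSE weekend-shift rule for holidays().
## Returns the observed datenum, or [] when the holiday is dropped.
function observed = observed_date (dn)
  switch (weekday (dn))
    case 7                      # Saturday -> preceding Friday
      observed = dn - 1;
      [y_orig, ~] = datevec (dn);
      [y_obs, ~]  = datevec (observed);
      if (y_obs != y_orig)      # shift would cross into the previous year
        observed = [];          # no observed holiday at all
      endif
    case 1                      # Sunday -> following Monday
      observed = dn + 1;
    otherwise
      observed = dn;            # weekday holiday, no shift
  endswitch
endfunction
```

For example, New Year's Day 2011 fell on a Saturday, so observed_date (datenum (2011, 1, 1)) returns empty: the shifted date would be December 31, 2010.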

A quite esoteric issue for someone like me. It is fixed now. Matlab compatibility, documentation and performance were improved, new tests were added, some of my hours were lost, and new useless knowledge was gained. Unfortunately, according to the fixed holidays(), people working at the NYSE now have less holidays to code for Octave. I'm sorry. And I should probably be writing my thesis instead.

by Carnë Draug at October 27, 2012 11:05 PM

      Wendy Liu

      Agora Octave: Update #11

      This past week has been full of midterms and assignments so this update will be shorter than usual. I made a few small changes (feature additions and bug fixes) to the bundle feature, mostly based on feedback from the mailing list.

      This week

      The full commit log is available here:

      • If an uploaded bundle contains a file named DESCRIPTION, and the bundle's description is empty, and the bundle is indicated to follow Octave packaging formatting standards, then the contents of that file should be used as the description (commit 184, commit 187)
      • If the user has already created a bundle with the specified name, then the form should display a validation error rather than 500'ing (commit 183)
      • Fixed the submit button text in the bundle form (commit 182)

      Next week

      I'll continue working on bundles and housekeeping changes (see last week's update). I'll also look into documentation integration (suggested by Carlo de Falco via the mailing list).

      In other news, I got an email from ESA a few days ago informing me that I have successfully completed SOCIS 2012. Much thanks to my mentor Jordi Gutiérrez Hermoso for all of his help as well as the positive feedback. I will continue to work on Agora Octave after the end of the official coding period (which I just realised is tomorrow, making this my last official blog post :() for as long as it takes to get the project in a usable state, although the frequency of my blog posts may decrease as I get closer to exam period. I'm also interested in eventually getting more involved with Octave as a whole, and hopefully even contributing to Octave itself one day — at the very least, it would be a great opportunity to learn C++. This will probably have to wait until next semester, or at least until I finish my exams (middle/end of December).

      by dellsystem (Wendy Liu) at October 27, 2012 04:00 AM