Planet Octave

April 18, 2014

Wendy Liu

Showing only additions and deletions in git

If you've ever been in a situation where you wanted to see only the additions, or only the deletions, in a git diff, then you probably know that it's not as simple as running git diff --diff-filter=A or git diff --diff-filter=D. At least, not if you're looking for added or deleted lines, as opposed to files.

From the man page for git diff:


Select only files that are

* A Added
* C Copied
* D Deleted
* M Modified
* [omitted]

That's pretty cool, but this only works on files. So this would let us see only new files, or only deleted files. But what about lines? What if you only want to see the lines that have been added (or deleted) in one file?

The solution

Unfortunately, git does not appear to have an option for this built-in. On the bright side, if you run a *NIX system, you can do this:

git diff | grep ^+

This gets you all the additions. To see all the deletions:

git diff | grep ^-

Note that the output of the first command will also give you something like

+++ b/filename

for each affected file. If this is a problem, you can alter the regex to account for this (or just delete it manually).
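For example, the header lines can be filtered out with a second grep. Here is a sketch in a throwaway repository (the file name and contents are made up for the demo):

```shell
# Set up a throwaway repo with one tracked file (hypothetical contents)
tmp=$(mktemp -d) && cd "$tmp" && git init -q
printf 'Alice\nBob\n' > badges.csv
git add badges.csv
git -c user.name=demo -c user.email=demo@example.com commit -qm 'initial badges'

# Modify the file: Bob cancels, Carol registers late
printf 'Alice\nCarol\n' > badges.csv

# Additions only, with the '+++ b/badges.csv' header filtered out
git diff | grep '^+' | grep -v '^+++'
# prints: +Carol
```

The same trick works for deletions with `grep '^-' | grep -v '^---'`.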

You can of course pass other parameters to git diff, such as the specific commit or commit range you want to see the diff for, or the filename(s).

The use case

So why would anyone ever need to do something like this?

If you've ever been to a Model United Nations conference, you probably know that in exchange for paying a nominal registration fee and giving up your weekend to try to convince someone (who is not actually a representative of the United States) that you (someone who is not actually a representative of Iran) have absolutely no intention of producing nuclear weapons, definitely not, let's go out for a drink and I'm sure we can clear up this whole misunderstanding, you get a nice shiny badge that says your name, your school, and the country that you are supposed to pretend to represent, which looks something like this:

[Image: a sample conference badge ("expanding acronyms is hard")]

For the last year or so, I was part of the organising committee for a Model United Nations conference that is sadly not named BADPUNS. As the USG-IT, one of my responsibilities was to take the registration information that was provided by conference attendees through the website and send it to the person in charge of printing badges. Since the website was completely custom-built and I didn't feel like writing a ton of code to make this into an actual feature (since, theoretically, it would only have to be used once), I just wrote a simple Django management command for exporting the information for the badges as a CSV file.

It worked fine, except for the fact that it was a week before the conference and some of the attendees still had not filled in their names. But we needed to get at least some of the ~1400 badges printed now, so I sent off what I had, and those badges got printed.

Two days later, when I exported the new badge list, I used git to figure out the difference between the current badges.csv and the previous one. The difficulty arose from the fact that not only did we have new badges to print (because some attendees didn't fill in their information before), we had badges to get rid of as well (because some attendees had last-minute cancellations). So I needed to use the exported badges.csv file to generate a new CSV containing just the new badges to be printed, as well as one listing the badges to be discarded.

That's where git and grep came to the rescue. To get rid of the + and - at the beginning of each line, I just used Vim's visual block mode. Much easier than manually checking each name, which is probably what I would have done if I didn't have git in my life.
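As an alternative to cleaning up in Vim, sed can strip the leading marker in the same pipeline. A sketch over simulated diff output (the names are invented for illustration):

```shell
# Simulated `git diff` output for badges.csv (hypothetical names)
diff_out='--- a/badges.csv
+++ b/badges.csv
-Bob,McGill University
+Carol,Concordia University'

# Keep added lines, drop the +++ header, strip the leading '+'
printf '%s\n' "$diff_out" | grep '^+' | grep -v '^+++' | sed 's/^.//'
# prints: Carol,Concordia University
```

Redirecting the output of such a pipeline to a file gives the "badges to print" CSV directly; swapping in `^-` and `^---` gives the "badges to discard" list.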

There may actually be a way of doing this with just git, but I haven't found it. This might be a pretty hacky solution, but hey, it works.

Know of a better solution? Do tell! I'm @dellsystem on Twitter.

by dellsystem (Wendy Liu) at April 18, 2014 04:00 AM

April 16, 2014

Jordi Gutiérrez Hermoso

Python is an excellent lingua franca

I just spent 5 days at PyCon 2014 here in Montréal (3 days for the actual conference, 2 days sprinting), and wow, what a great conference that was.

There are many things I want to praise about the whole experience. The venue was great, the organisation was superb, the talks were interesting, the infrastructure was amazing, the atmosphere was friendly… but most of all, I think I want to praise the entire culture of inclusiveness that the Python community is trying to promote.

It is interesting that the only true common thread at the conference was a programming language (and not even that, sometimes; some of the talks were hardly about the Python programming language at all). Python was originally conceived as a programming language that was meant to be as easy as possible to understand. Whether it has succeeded from a purely language-design point of view is hard to say, and not everything about Python is great. The language has its gotchas here and there, just like any other language. And yet, despite not being a perfect programming language, it’s able to bring together such a diverse group of individuals to accomplish common goals.

Python is an excellent programming lingua franca for everyone, not just for Unix nerds (witness: Windows support is taken seriously) and not just for programming geeks (witness: Software Carpentry). Just take a look at the wide range of topics covered in the talks. General software development, the benefits of software freedom, cryptography and security (lol, heartbleed)…

Of particular note is that 1/3 of all attendees and speakers were female. Can any other tech conference boast such inclusiveness of the usually tech-neglected half of humankind? Look at all the talks related to gender issues: sexism in rap lyrics via machine learning, being a transgender Python hacker, or how to bring more Python to girls in school.

Now, to be clear, I don’t think that PyCon has eliminated sexism or that we have “won” this battle. As I overheard someone say, PyCon will not be inclusive enough for women unless the lines for the women’s bathroom are as long as the lines for the men’s. And there are still many issues, such as women still being “invisible” and ignored, or as I overheard someone else say, she actually had to say to some guy to look up from her breasts while she was talking to him. It’s not there all the way yet.

This just seems like a good start. I hope next year at PyCon 2015, we’ll be able to get 50% women attendees and speakers!

by Jordi at April 16, 2014 02:14 PM

March 06, 2014



Our Google Summer of Code project ideas got some promotion at Universiteit Gent (Belgium) this week.

Student applications to Google open on the 10th. The 21st is the application deadline. If you’re interested in applying to work on Octave, see here first.

by Nir at March 06, 2014 09:04 PM

February 26, 2014



Octave has been accepted into Google Summer of Code, for our first time as an independent organization! Student applications are due March 21, and decisions about accepting students will be made in early April.

To make GSoC a success, we need not only strong student programmers but also committed mentors to supervise the students and work with them so that their contributions are useful for Octave. This can be tough to achieve because mentors are unpaid (though Google may send you a nifty T-shirt).

Each project should have a primary mentor and a backup mentor. In my experience, primary mentors should plan to devote 5-10 hours per week to the project, including speaking with the students and reviewing their code. Backup mentors have a smaller time commitment but still should keep up with project progress and be available to step in when the primary mentor is otherwise occupied.

If you’d like to be a project mentor:

1) Go to the wiki projects page and put your name next to projects you’d be interested in mentoring. Feel free to improve the project description at the same time.

2) Sign up as a mentor for Octave on the GSoC Melange site.

3) Start interacting with student applicants who have similar topics of interest.

by Nir at February 26, 2014 07:36 PM

February 01, 2014

Jordi Gutiérrez Hermoso

Poutine bought with bitcoins

I am cautiously hopeful for bitcoin. I just ate pizza and poutine with a group of friends, paying at the restaurant with bitcoins!

Acquiring the bitcoins

I am not a speculator. I am not an investor. I am not a miner. I am not trying to get rich nor make any money whatsoever by manipulating bitcoins. I just want to be able to earn and spend money on the internet without needing the permission of a bank, or the permission of Paypal, or making it anyone’s business but mine and the person I’m actually doing business with.

I acquired some bitcoins about a year ago; at the time it was about 1.8 bitcoins, worth around 30 USD. I did not buy them. I did not invest in bitcoins. I did not mine them. I received them as payment for tutoring mathematics in an IRC channel for an hour and a half. I earned these bitcoins through my labour, just like I would earn any other currency.

At the time there was not much I could do with bitcoins. Although I freely encouraged and accepted the payment in bitcoins, I did it more with amusement than conviction. I am not a criminal, but since at the time Silk Road still had a sort of forbidden underground allure, my client suggested that I could use the bitcoins to buy illegal drugs on the black market. I had no intention to do so, but I decided to keep the bitcoins around, in the cautious hope that I could use them for something else.

First purchases

I kept the bitcoins for quite some time, since I could not find anything to spend them on. The first thing I could find that seemed interesting was a 4chan “account” (in reality, just an exemption from its captcha). Despite the website’s bad reputation as being the cesspool of the internet, I think that there is value in its pro-anonymity ethos, something that is slowly being eroded away in today’s online world. This was also my first test case for the possibility of anonymous online currency. In keeping with this ethos, I made the payment, and when there was a slight hiccup with the transaction, used a throw-away email address to resolve the issue. No chargebacks, but people are still people and are still mostly nice. I thus obtained my 4chan captcha exemption. I have not really used it since then, but I am satisfied knowing that I have made a small contribution towards promoting anonymity.

I kept my eye out for news stories of more opportunities to spend bitcoins on. The currency seemed to be slowly gaining more adoption, especially for non-material goods and services. My next purchase was one month of reddit gold as a gift for a particularly witty commentator. These two purchases together had already given a significant blow to my bitcoin balance, but I was not duly concerned. After all, this was just pocket change I acquired for 90 minutes of a hobby task.

Then, suddenly, over a couple of weeks the bitcoin price exploded from 20 USD per bitcoin to over 1000 USD per bitcoin. I didn’t exactly become a millionaire, but my paltry fraction of a bitcoin now had the power to buy more things.

Bitcoin starting to look like a possibility

I made a few more purchases. A PDF copy of the Cypherpunks book by Julian Assange et al. A month of VPN access. I sent bitcoins to a kind stranger on the internet in exchange for digital albums from the indie band Nectarphonic. When gift cards became available in exchange for bitcoins, I obtained my first physical product for bitcoins: a boxed set of Suzanne Collins’s The Underland Chronicles.

This really had just been my time getting used to bitcoin and how it works. The fluctuations in price, the cryptography behind it, the UI presented first by the “official” bitcoinqt client and understanding more how the alternative bitcoin client Electrum works. The security model, what it means to “own” bitcoins. Passwords and private keys. The workings of the blockchain. Using Tor for enhanced anonymity of transactions.

I finally got around to reading Satoshi Nakamoto’s whitepaper. Wow. I don’t know if this person or group is a genius or just a child of the times, but Nakamoto seems to have solved a cryptographic protocol problem that nobody had solved before. Nakamoto didn’t invent the idea of cryptocurrencies, but built upon the ideas of others in order to create a decentralised system. I was duly impressed.

Then two days ago I saw that the first restaurant in Montréal was proclaiming to accept bitcoins. I could now buy Québec’s signature dish, poutine, with bitcoins! Excited, I started making plans.

Food for bitcoins

I eagerly announced yesterday on IRC that I was planning to go to this restaurant to try out my bitcoins. Another chatter local to Montréal decided to tag along for fun. We made plans to go that very evening for a night of socialising, three couples. I eagerly explained to everyone my excitement over finally being able to pay someone face-to-face with bitcoins.

My fiancée and I got to Montréal Poutine in the Old Port a few minutes earlier than everyone else. It was a small location, hardly more than a greasy spoon, but par for the course for a poutine joint. There were signs all over the establishment announcing the possibility of paying with bitcoins and a wifi password on their chalkboards.

We eyed the menu, and I nervously made a few preliminary checks to ensure the transaction would go smoothly. Due to one of my many quirks, I do not own a pocket computer (“smartphones”, neither smart and hardly phones), so I was prepared to pay with my laptop and a webcam for scanning a QR code. I ensured that the internet connection was stable and that I could scan a QR code. As instructed in (or in, because this is .ca), I proudly announced to my server that I was going to pay with bitcoins. “Oh, wow, it’s my first time,” she said in French. “Mine too,” I replied with a smile.

Our company arrived, the two other couples. We ordered various kinds of poutine, pizza, avocado fries, beer and mineral water. We chatted about many things, and soon predictable discussions about bitcoins ensued. Their method of operation, their security, their volatility but recent relative stability. Musings if they would ever work or not. The more techy in the bunch were showing signs of optimism, while those who found the idea more foreign were predicting the eventual doom for bitcoins.

After a very pleasant evening, it was time to pay the bill.

The moment of truth

I announced that I had enough bitcoins to pay for all the food, but I asked everyone else to divvy up the drinks between them. I also told them that I had read that tips were not accepted in bitcoins yet, so that would have to be paid elsehow.

I readied my laptop and the webcam, to the growing amusement of my companions. The server came with one bill for food and another for drinks. I reminded her that I was going to pay with bitcoins, but I wasn’t quite sure what to expect. I had seen videos online of handheld machines that displayed a QR code with the bitcoin address to pay to, and that was my guess for what I would see. Instead, she pointed me to a static QR code pasted next to the cash register. Was this the bitcoin address to pay to? I walked with my baroque laptop-and-webcam rig over to the cash register, in good fun as I heard my giggling friends behind me.

I used Electrum to scan the QR code, looking for the address. Instead of a bitcoin address, this was a Bitpay URL. I wasn’t quite sure what to do with it, and I floundered for a few seconds. I opened a generic QR code scanner to get the URL. One of the guys with us helpfully walked over to help me scan the QR code, but I had already managed to get the URL loaded in Firefox by this time. The server was re-reading the instructions next to the QR code on how to pay.

At the Bitpay URL, there were two fields to fill out: an order number and the amount in CAD to pay. The server read the instructions again and said to leave the order number blank. I filled in the amount with taxes, showed it to her, and we agreed it was correct. I clicked enter. A bitcoin address showed up. I went back to Electrum to send money to that address. I stumbled when typing my wallet password. The server, getting the hang of the transaction, said bemusedly, “I cannot help you with that.” Finally I got it, and the familiar “this invoice has been paid” message showed up on the Bitpay website.

A few seconds passed, and the restaurant staff confirmed on another computer that they had received the payment. I showed everyone at the table the paid invoice on my laptop screen. A few other customers in the restaurant were showing some interest. I announced to everyone, “it’s ok, bitcoin has worked, I made the payment!”

Everyone relaxed and proceeded to tackle the more familiar problem of paying for the drinks with more conventional currency.

Future of bitcoins

I don’t know if bitcoin is going to crash or not. I don’t understand enough economics to comment on what the value of bitcoins is, or if it really is different from the value that we give to any other currency. I have a rough understanding of the cryptography behind it and what makes the whole thing work. I know people are working on making bitcoins more accessible to people less enthusiastic than me, and I very much wish they succeed.

All I know is that so far bitcoins have worked for me. I look forward to getting better acquainted with this new way to conduct business. I wish good fortune to everyone else who is also trying to build a new currency for an internet-powered age.

by Jordi at February 01, 2014 09:51 PM

January 08, 2014

Jordi Gutiérrez Hermoso

Polling for OctConf 2014

So, we want to know if this is going to work out.

OctConf 2014
Can you come to Montréal, Canada, in the summer of 2014 for OctConf?

by Jordi at January 08, 2014 02:59 PM

January 05, 2014

Roberto Porcù

December 15, 2013

Roberto Porcù

RATTLE algorithm


Constrained mechanical systems form an important class of differential equations on manifolds. For all the theory presented here and used in the implementation, I refer to "Geometric Numerical Integration" by Hairer, Lubich and Wanner.

Consider a mechanical system described by position coordinates \(q_1,\dots q_d\) and suppose that the motion is constrained to satisfy \(g(\mathbf q) = \mathbf 0\), where \(g:\mathbb R^d \rightarrow \mathbb R^m \) with \(m \lt d \). Then the equations of motion governing the system become: $$ \left\{ \begin{array}{l} \dot{\mathbf q} = \frac{\partial H}{\partial \mathbf p} \\ \dot{\mathbf p} = -\frac{\partial H}{\partial \mathbf q} - G(q)^T\lambda \\ g(\mathbf q) = \mathbf 0 \end{array}\right. $$ where \(G(\mathbf q) = \dfrac{\partial g(\mathbf q)}{\partial \mathbf q} \).
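For concreteness, a classical example of such a system (it appears in Hairer, Lubich and Wanner's book) is the planar pendulum of unit length and mass in Cartesian coordinates, where the endpoint is constrained to the circle:

$$ H(\mathbf q,\mathbf p) = \tfrac{1}{2}\left(p_1^2 + p_2^2\right) + q_2, \qquad g(\mathbf q) = q_1^2 + q_2^2 - 1, \qquad G(\mathbf q) = \left(2q_1,\ 2q_2\right), $$

so here \(d = 2\) and \(m = 1\).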

The symplectic Euler method can be extended to constrained systems, but here we focus on the SHAKE and RATTLE algorithms. SHAKE is a two-step algorithm, so, since I'm implementing only one-step algorithms and the overall structure of solvers and integrators is designed for one-step solvers, I implemented just the RATTLE algorithm.


The RATTLE algorithm implemented works with any general Hamiltonian \(H(\mathbf q,\mathbf p)\) and is defined as follows: $$ \left\{\begin{array}{l} \mathbf p_{n+\frac{1}{2}} = \mathbf p_n -\frac{h}{2}\big(H_{\mathbf q}(\mathbf q_n,\mathbf p_{n+\frac{1}{2}}) + G(\mathbf q_n)^T \mathbf{\lambda}_n \big) \\ \mathbf q_{n+1} = \mathbf q_n +\frac{h}{2} \big( H_{\mathbf p}(\mathbf q_n,\mathbf p_{n+\frac{1}{2}}) + H_{\mathbf p}(\mathbf q_{n+1},\mathbf p_{n+\frac{1}{2}}) \big) \\ g(\mathbf q_{n+1}) = \mathbf 0 \\ \mathbf p_{n+1} = \mathbf p_{n+\frac{1}{2}} -\frac{h}{2}\big(H_{\mathbf q}(\mathbf q_{n+1},\mathbf p_{n+\frac{1}{2}}) + G(\mathbf q_{n+1})^T \mathbf{\mu}_n \big) \\ G(\mathbf q_{n+1}) H_{\mathbf p}(\mathbf q_{n+1},\mathbf p_{n+1}) = \mathbf 0 \end{array}\right. $$ where \( h=\Delta t=t_{k+1}-t_k\) and \(\mathbf{\lambda}_n\), \(\mathbf{\mu}_n\) are the Lagrange multipliers needed to impose the constraints.

It can be demonstrated that this numerical method is symmetric, symplectic and convergent of order 2.

The following code is its implementation:

function [t_next,x_next,err]=rattle(f,t,x,dt,options)
H = odeget(options,'HamiltonianHessFcn',[],'fast');
GG = odeget(options,'ConstraintHessFcn',[],'fast');
if( ~isempty(H) && ~isempty(GG) )
fsolve_opts = optimset('Jacobian','on');
else
fsolve_opts = optimset('Jacobian','off');
end
g = odeget(options,'ConstraintFcn',[],'fast');
G = odeget(options,'ConstraintGradFcn',[],'fast');
c_nb = odeget(options,'ConstraintsNb',[],'fast');

dim = length(x)/2;
q0 = x(1:dim);
p0 = x(dim+1:end);

RATTLE = @(y)constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt);
y0 = [q0;p0;p0;zeros(2*c_nb,1)];
y0 = fsolve(RATTLE,y0,fsolve_opts);
t_next = t+dt;
x_next = [y0(1:dim);y0(2*dim+1:3*dim)];

dt = dt/2;
RATTLE = @(y)constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt);
y0 = [q0;p0;p0;zeros(2*c_nb,1)];
y0 = fsolve(RATTLE,y0,fsolve_opts);

q0 = y0(1:dim);
p0 = y0(2*dim+1:3*dim);
t = t+dt;

RATTLE = @(y)constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt);
y0 = [q0;p0;p0;zeros(2*c_nb,1)];
y0 = fsolve(RATTLE,y0,fsolve_opts);

x_est = [y0(1:dim);y0(2*dim+1:3*dim)];
err = norm(x_est-x_next,2);

function [F,J] = constr_sys(y,dim,c_nb,f,H,g,G,GG,t,q0,p0,dt)
F = zeros(3*dim+2*c_nb,1);
F(1:dim) = y(1:dim) - q0 - (dt/2).*(f(t,[q0; ...
y(dim+1:2*dim)])(1:dim) + f(t+dt,y(1:2*dim))(1:dim));
F(dim+1:2*dim) = y(dim+1:2*dim) - p0 - (dt/2).*(f(t,[q0; ...
y(dim+1:2*dim)])(dim+1:end) - G(q0)'*y(3*dim+1:3*dim+c_nb));
F(2*dim+1:3*dim) = y(2*dim+1:3*dim) - y(dim+1:2*dim) - ...
(dt/2)*(f(t+dt,y(1:2*dim))(dim+1:end) - ...
F(3*dim+1:3*dim+c_nb) = g(y(1:dim));
F(3*dim+c_nb+1:end) = G(y(1:dim))*(f(t+dt,[y(1:dim); ...

if( nargout==2 )
J = zeros(3*dim+2*c_nb,3*dim+2*c_nb);
J(1:dim,1:dim) = eye(dim) - ...
J(1:dim,dim+1:2*dim) = -(dt/2)*(H(t,[q0; ...
y(dim+1:2*dim)])(dim+1:end,dim+1:end) + ...
J(dim+1:2*dim,dim+1:2*dim) = eye(dim) + ...
J(dim+1:2*dim,3*dim+1:3*dim+c_nb) = (dt/2)*G(q0)';
J(2*dim+1:3*dim,1:dim) = (dt/2)*(H(t+dt, ...
for k = 1:1:c_nb
J(2*dim+1:3*dim,1:dim) = J(2*dim+1:3*dim,1:dim) - ...
J(2*dim+1:3*dim,dim+1:2*dim) = -eye(dim) + ...
J(2*dim+1:3*dim,2*dim+1:3*dim) = eye(dim) + ...
J(2*dim+1:3*dim,3*dim+c_nb+1:end) = (dt/2)*G(y(1:dim))';
J(3*dim+1:3*dim+c_nb,1:dim) = G(y(1:dim));
J(3*dim+c_nb+1:end,1:dim) = G(y(1:dim))* ...
for k = 1:1:c_nb
J(3*dim+c_nb+k,1:dim) = J(3*dim+c_nb+k,1:dim) + ...
((GG(y(1:dim))(:,:,k))*(f(t+dt,[y(1:dim); ...
J(3*dim+c_nb+1:end,2*dim+1:3*dim) = G(y(1:dim))* ...
(H(t+dt,[y(1:dim);y(2*dim+1:3*dim)]) ...
It works with any number of constraints, provided that this number is less than the dimension of the system. As usual, all the source code is available at my public repository octave-odepkg.

by Roberto Porcù at December 15, 2013 03:47 PM

December 09, 2013

Roberto Porcù

Steppers and Integrate functions

In this first post I want to explain the organization of the code that I'm going to implement. The main idea is to have a structured organization and to subdivide the code according to the most important operations that must be executed. This way it should be easier to optimize the bottlenecks and to extend the code with new functionality.

We will have two main actors: the steppers and the integrate functions.


A stepper will be a function representing the numerical method used for the integration. Its job is to execute just one integration step, and its signature will be:

[x_next,err] = stepper(f,x,t,dt)

x_next is the solution at the next step, err is an estimation of the error (obtainable for example with Richardson Extrapolation), f is a function handle representing the equations to be integrated, x is the solution at the current time t and finally dt is the time step.

Inside this function I'll check if err is really requested by means of nargout keyword. The estimation of the error will be useful to determine the optimal dt in adaptive integrators.

To set the parameters for both the numerical method and the solver, I'll use odeset and odeget, which are not embedded into Octave but are functions of Odepkg. My intent is to edit these functions in order to make them suitable for my parameter setting.

Integrate Functions

An integrate function will be the function executing the integration algorithm over multiple steps. We will have different integrate functions depending on how we want to obtain the solution. For example we can have:

  • integrate_const(stepper,f,x0,t0,t1,dt) integrating from t0 to t<=t1 with a fixed dt;
  • integrate_n_steps(stepper,f,x0,t0,dt,n) integrating from t0 for n integration steps with fixed dt;
  • integrate_adaptive(stepper,p,f,x0,t0,t1,dt) integrating from t0 to t1 with an adaptive timestep, where p is the order of the stepper.

A simple example: Forward Euler

In this part I'll show an example of two simple steppers (fwe_richardson and fwe_heun). Both steppers use the Forward Euler method to find the new solution: $$x_{k+1} = x_{k} + h f(t_{k},x_{k})\ .$$ They differ in the way the error is estimated: fwe_richardson uses Richardson Extrapolation, while fwe_heun uses the Heun-Euler method.
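To recall how the Richardson estimate works: for a method of order \(p\), comparing one step of size \(h\) (giving \(x_{k+1}\)) with two steps of size \(h/2\) (giving \(\hat{x}_{k+1}\)) yields the classical local-error estimate

$$ e_{k+1} \approx \frac{\hat{x}_{k+1} - x_{k+1}}{2^{p}-1}\ . $$

For Forward Euler \(p = 1\), so the denominator is \(2^1 - 1 = 1\) and the estimate reduces to \(\hat{x}_{k+1} - x_{k+1}\), which is exactly the quantity whose norm fwe_richardson returns as err.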

This is the implementation of the fwe_richardson stepper:

function [t_next,x_next,err] = fwe_richardson(f,t,x,dt,options)
  x_next = x + dt.*f(t,x);
  t_next = t+dt;
  if( nargout == 3 )
    x1 = x + (.5*dt).*f(t,x);
    x_est = x1 + (.5*dt).*f(t+.5*dt,x1);
    err = norm(x_next - x_est,2);
  end
end

The Heun-Euler method combines the Heun method, which is of order 2, with the Euler method, which is of order 1. This combination makes it possible to get an estimate of the local error. We can write the Heun method as $$\begin{aligned} \tilde{x}_{k+1} &= x_{k} + h f(t,x_{k}) \\ x_{k+1}^{*} &= x_{k} + \frac{h}{2} (f(t,x_{k}) + f(t+h,\tilde{x}_{k+1})) \end{aligned}$$ so that the error estimate is given by: $$ e_{k+1} = x^{*}_{k+1} - \tilde{x}_{k+1} = \frac{h}{2}(f(t+h,\tilde{x}_{k+1}) - f(t,x_{k}))$$ This is the implementation of the fwe_heun method:

function [t_next,x_next,err] = fwe_heun(f,t,x,dt,options)
  x_next = x + dt.*f(t,x);
  t_next = t+dt;
  if( nargout == 3 )
    err = .5 * dt * norm(f(t+dt,x_next) - f(t,x),2);
  end
end

The adaptive integrator

Finally, I'll describe how integrate_adaptive is implemented. As I said, this function performs the integration on the whole interval [t0,t1], adapting the timestep at each iteration by means of the error estimate returned by the stepper. This function takes as input:

  • stepper: the stepper;
  • p: the order of the stepper;
  • f: the function to be integrated;
  • x0: the initial condition;
  • t0, t1: the extremes of the time interval;
  • dt: the first timestep.
While the current time is less than or equal to the final instant, this function repeatedly calls the stepper and then updates the solution, the time and the next timestep. Finally, since the very last instant may be greater than t1, the function replaces the last element of the solution with a linear interpolation evaluated at t1 between the last two elements of the solution.

function [t,x] = integrate_adaptive(stepper,order, ...
t = tspan(1);
x = x0(:);
dt = odeget(options,'InitialStep',[],'fast');
tau = 1.e-6;
counter = 2;
while( t(end) < (tspan(end)-1.e-13) )
[s,y,err] = stepper(func,t(end),x(:,end),dt,options);
if( s(end) > (tspan(counter)+1.e-13) )
z = [t(end);s];
u = [x(:,end),y];
i = 1;
while( i <= length(z) )
if( abs(z(i)-tspan(counter)) <= 1.e-13 )
x = [x,y];
t = [t;s];
i_old = i;
i = length(z)+1;
elseif( z(i) > (tspan(counter)+1.e-13) )
x = [x(:,1:end-1),u(:,1:i-1),u(:,i-1)+ ...
((tspan(counter)-z(i-1))/(z(i)-z(i-1)))* ...
t = [t(1:end-1);z(1:i-1);tspan(counter);z(i:end)];
i_old = i;
i = length(z)+1;
x = [x,y];
t = [t;s];
err += eps;
dt = dt.*(tau/abs(err))^(1.0/(order+1));
if( counter > max(size(tspan)) )
k = (length(z)-i_old)+1;
x = x(:,1:end-k);
t = t(1:end-k);
x = x';

The method used to calculate the next timestep is taken from the book by Shampine, Gladwell and Thompson. tau is a parameter which is fixed for now, but in the future it will be set by means of the odeset and odeget functions.
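In symbols, the update performed in the loop above is the elementary step-size control formula for a stepper of order \(p\) and tolerance \(\tau\):

$$ dt_{new} = dt \left( \frac{\tau}{|e|} \right)^{\frac{1}{p+1}}\ , $$

where \(e\) is the local error estimate returned by the stepper; the err += eps line guards against a division by zero when the estimated error vanishes.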

How it will be at the end

At the end, functions like ode45 will be just aliases. For example, we can define a function called odefwe which performs the integration by means of the Forward Euler method, adapting the timestep with Richardson Extrapolation. The call to this function will just hide a call to the integrate_adaptive function, in this way:

odefwe(f,x0,t0,t1,dt) = ...
In the same way, we'll define a function called ode12 which does the integration in an adaptive way using the Heun-Euler method, so that we will have:
ode12(f,x0,t0,t1,dt) = integrate_adaptive(fwe_heun,1,f,x0,t0,t1,dt)

by Roberto Porcù at December 09, 2013 08:37 AM

Odeset & odeget


The odeset and odeget functions allow the user to build a valid ODE options structure. They are already available in Octave's odepkg, but they are not perfectly compatible with MATLAB's odeset and odeget functions. Furthermore, for geometric integrators like Spectral Variational Integrators, I will need new options in the ODE options structure which are currently not admitted in Octave's odepkg.

So I have written my own versions of odeget.m, odeset.m and the new function ode_struct_value_check.m, in order to have full compatibility with MATLAB's odeset and odeget (the same behaviour as described in their official documentation) and also the possibility to add the new field names which I will need for future solvers in this project.

The new fields are the union of MATLAB ODE options and Octave ODE options, plus my new options (default values are in square brackets):

             AbsTol:  scalar or vector, >0, [1.e-6]
                BDF:  binary, {on, [off]}
             Events:
           Explicit:  binary, {yes, no, []}
      InexactSolver:  switch, {inexact_newton, []}
       InitialSlope:  vector, []
        InitialStep:  scalar, >0, []
           Jacobian:  matrix or function_handle, []
          JConstant:  binary, {on, [off]}
           JPattern:  sparse matrix, []
               Mass:  matrix or function_handle, []
       MassConstant:  binary, {on, [off]}
       MassSingular:  switch, {yes, [maybe], no}
MaxNewtonIterations:  scalar, integer, >0, [1.e3]
           MaxOrder:  switch, {1, 2, 3, 4, [5]}
            MaxStep:  scalar, >0, []
   MStateDependence:  switch, {none, [weak], strong}
          MvPattern:  sparse matrix, []
          NewtonTol:  scalar or vector, >0, []
        NonNegative:  vector of integers, []
        NormControl:  binary, {on, [off]}
          OutputFcn:  function_handle, []
         OutputSave:  scalar, integer, >0, []
          OutputSel:  scalar or vector, []
   PolynomialDegree:  scalar or vector, integer, >0, []
    QuadratureOrder:  scalar or vector, integer, >0, []
             Refine:  scalar, integer, >0, []
             RelTol:  scalar, >0, [1.e-3]
              Stats:  binary, {on, [off]}
     TimeStepNumber:  scalar, integer, >0, []
       TimeStepSize:  scalar, >0, []
         Vectorized:  binary, {on, [off]}

The usage of odeset will be one of the following:

opt = odeset()
opt = odeset(old_ODE_options,new_ODE_options)
opt = odeset(old_ODE_options,'field1',value1,'field2',value2,...)
opt = odeset('field1',value1,'field2',value2,'field3',value3,...)

The usage of odeget will be one of the following:

option_value = odeget(ODE_structure,'field')
option_value = odeget(ODE_structure,'field',default_value)
option_value = odeget(any_struct,'field',default_value,'fast')
The last usage is needed for MATLAB compatibility and provides fast access to the given structure, with no error checking on option names.
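The lookup behaviour just described can be modelled in a few lines. This is an illustrative Python sketch, not the real odeget (which is written in Octave and additionally performs fuzzy name matching); the dictionary-based structure is an assumption for the example:

```python
# Illustrative Python model of odeget's lookup logic: return the stored
# option value, or the default when the field is absent or empty.

def odeget(ode_struct, field, default=None, fast=False):
    if fast:
        # 'fast' mode: plain access with no checking of option names
        return ode_struct.get(field, default)
    if field not in ode_struct:
        # the real implementation would run fuzzy matching on the name here
        return default
    value = ode_struct[field]
    return value if value is not None else default

opts = {"RelTol": 1e-3, "AbsTol": None}
rtol = odeget(opts, "RelTol", 1e-2)   # -> 1e-3 (stored value wins)
atol = odeget(opts, "AbsTol", 1e-6)   # -> 1e-6 (empty field falls back)
```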

Fuzzy string comparison

Users may make mistakes while typing ODE option names, or they may want to write everything in lower case for faster coding. My odeset and odeget functions therefore behave in a user-friendly way, calling the fuzzy_compare function, which computes the distance of each structure option name from the given word and returns the indices of those options whose distance is below a given tolerance.

The idea is that word case is not relevant. If we want an exact match, the tolerance will be 0. If no index is returned, the input is definitely a typing error. If one index is returned but the words don't match exactly, a warning is displayed saying that the program will proceed assuming the user meant the closest option name. Otherwise, all the fields whose distance is below the tolerance are displayed and the user is asked to enter the name again.

The signature of this function is as follows.

res = fuzzy_compare(string1,string_set,correctness)
  • string1 must be a string: the option name we're looking for;
  • string_set must be a cell array of strings or a column vector of strings: it represents the set of option names;
  • correctness is the tolerance we want, and is an optional input argument.
We may set correctness equal to 'exact' or 0, in which case fuzzy_compare will look only for exact matches; if we set correctness to any positive integer, it will be the tolerance of the method (that is, the maximum accepted distance from the given string). If we don't specify correctness, the tolerance is computed as a function of the minimum distance and of the length of the given string: the smaller the distance, the smaller the tolerance; the greater the length, the greater the tolerance.
tolerance = minimus + floor(length(string1)/4)*floor(minimus/3)
There exist many definitions of distance between strings. I've chosen the Levenshtein distance.
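The matching logic can be sketched as follows. This is an illustrative Python version, not the real fuzzy_compare (which is written in Octave and has the interactive behaviour described above); the helper names are made up:

```python
# Illustrative sketch: return the indices of the candidate option names
# whose (case-insensitive) Levenshtein distance is within the tolerance.

def levenshtein(a, b):
    # standard dynamic-programming edit distance, row by row
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def fuzzy_compare(name, candidates, tolerance):
    # word case is not relevant, as described above
    return [i for i, c in enumerate(candidates)
            if levenshtein(name.lower(), c.lower()) <= tolerance]

options = ["AbsTol", "RelTol", "MaxStep"]
matches = fuzzy_compare("reltol", options, 0)   # exact match up to case
```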

Levenshtein distance

The Levenshtein distance is a string metric equal to the minimum number of single-character edits (insertions, deletions and substitutions) required to change one word into the other. This is the main algorithm of the levenshtein function:

function [dist,d] = levenshtein(string1,string2)
  m = length(string1);
  n = length(string2);

  d = zeros(m+1,n+1);
  d(2:m+1,1) = [1:1:m];
  d(1,2:n+1) = [1:1:n];

  for j=2:1:n+1
    for k=2:1:m+1
      % standard recurrence: d(k,j) is the distance between the first
      % k-1 characters of string1 and the first j-1 of string2
      d(k,j) = min([d(k-1,j)+1, d(k,j-1)+1, ...
                    d(k-1,j-1)+(string1(k-1)~=string2(j-1))]);
    endfor
  endfor

  dist = d(m+1,n+1);
endfunction

All the code is available under the terms of GNU General Public License at my_octave-odepkg.

by Roberto Porcù at December 09, 2013 08:32 AM

Backward Euler with Inexact Newton solver

Backward Euler

The backward Euler method is one of the most common methods used to solve ODEs. It is an implicit method of order one, but its strength lies in its A-stability property.

Given a system of ODEs of the form: $$\frac{d\mathbf{y}}{dt} = \mathbf{f}(t,\mathbf{y}) \\ \mathbf{y}(t_0) = \mathbf{y}_0\ ,$$ the backward Euler method is written as: $$ \mathbf{y}_{k+1} = \mathbf{y}_k + \Delta t\ \mathbf{f}(t_{k+1},\mathbf{y}_{k+1})\ .$$

If we define u to be the exact solution, then we have: $$ \frac{d\mathbf{u}}{dt} = \mathbf{f}(t,\mathbf{u}) \\ \mathbf{u}(t_k) = \mathbf{y}_k $$ So that $$\mathbf u(t_{k+1}) = \mathbf u(t_k) + \int_{t_k}^{t_{k+1}} \mathbf{f}(t,\mathbf u) dt\ ,$$ which, under some regularity conditions, can be rewritten as $$\mathbf u(t_{k+1}) = \mathbf u(t_k) + \Delta t\ \mathbf f(t_{k+1},\mathbf u(t_{k+1})) - \frac{{\Delta t}^2}{2} \mathbf f'(\xi)\ .$$

From these results it can be derived that the local truncation error (LTE) is $$ \mathbf e_{k+1} = \mathbf y_{k+1} - \mathbf u(t_{k+1}) = \mathcal{O}({\Delta t}^2)\ .$$
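The A-stability mentioned above is easy to see on the linear test equation \(y' = \lambda y\), where the implicit update can be solved in closed form. This is an illustrative sketch, not odepkg code:

```python
# Backward Euler on y' = lam*y: the implicit step
# y_{k+1} = y_k + dt*lam*y_{k+1} has the closed-form solution below.

def backward_euler_linear(lam, y0, dt, n_steps):
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 - dt * lam)   # exact solution of the implicit step
    return y

# A-stability in action: even with a large step on a stiff problem
# (lam = -1000, dt = 0.1) the numerical solution decays instead of
# blowing up, which explicit Euler would do here.
y = backward_euler_linear(-1000.0, 1.0, 0.1, 10)
```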

Inexact Newton

Now we have to solve the non-linear system built up by the backward Euler method. We want to use an inexact Newton solver. The classical Newton method reads: given a non-linear system F(x) = 0 and an initial guess, solve the linearized system $$ \mathbf F'(\mathbf x_k)\mathbf s_k + \mathbf F(\mathbf x_k) = \mathbf 0 \\ \mathbf x_{k+1} = \mathbf x_k + \mathbf s_k\ ,$$ and keep iterating until a given tolerance is reached.

In many cases, especially when we don't have an explicit expression for the Jacobian or it can't be inverted, we can use an inexact Newton method: $$ || \mathbf F'(\mathbf x_k)\mathbf s_k + \mathbf F(\mathbf x_k) || \leq \eta_k || \mathbf F(\mathbf x_k)||\ ,$$ where \(\eta_k\) is called the forcing term. The linearized system can be solved with an iterative method such as GMRES or the preconditioned conjugate gradient.

There are two possible choices for the forcing term:

  1. first choice $$ \eta_k = \frac{|| \mathbf F(\mathbf x_k)-\mathbf F(\mathbf x_{k-1}) -\mathbf F'(\mathbf x_{k-1})\mathbf s_{k-1} ||}{||\mathbf F(\mathbf x_{k-1}) ||}\ ;$$
  2. second choice $$ \eta_k = \gamma \Big(\frac{|| \mathbf F(\mathbf x_k) ||}{|| \mathbf F(\mathbf x_{k-1}) ||}\Big)^{\alpha}\ .$$

Choosing the forcing term in one of the two ways above, rather than imposing a fixed tolerance, makes it possible to avoid the phenomenon of oversolving. Another advantage is that if we use GMRES as the linear solver, it is not necessary to know the Jacobian nor to determine it, because we can approximate the term $$ \mathbf F'(\mathbf x_{k})\mathbf s_k $$ with a first-order finite difference formula: $$ \mathbf F'(\mathbf x_k)\mathbf s_k \approx \frac{\mathbf F(\mathbf x_k + \delta\mathbf s_k) - \mathbf F(\mathbf x_k)}{\delta} \ .$$
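The finite-difference approximation above is what makes the method "Jacobian-free". As an illustrative sketch (in Python, with made-up names, not the odepkg code):

```python
# Sketch of the Jacobian-free product F'(x)·s ≈ (F(x + δs) − F(x)) / δ,
# the building block of Jacobian-free Newton-Krylov methods.

def jac_vec_fd(F, x, s, delta=1e-7):
    """Approximate the directional derivative of F at x along s."""
    Fx = F(x)
    x_pert = [xi + delta * si for xi, si in zip(x, s)]
    return [(fp - f0) / delta for fp, f0 in zip(F(x_pert), Fx)]

# Example: F(x) = (x0^2, x0*x1). The exact Jacobian-vector product at
# x = (1, 2) with s = (1, 1) is (2, 3); the approximation is O(delta) close.
approx = jac_vec_fd(lambda x: [x[0] ** 2, x[0] * x[1]], [1.0, 2.0], [1.0, 1.0])
```

Inside GMRES, every occurrence of the matrix-vector product can be replaced by one extra evaluation of F, so the Jacobian is never formed.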

The signature of inexact_newton is the same as that of fsolve:


by Roberto Porcù at December 09, 2013 08:28 AM

Spectral Variational Integrators

Variational integrators

Variational integrators are a class of numerical methods for mechanical systems which comes from the discrete formulation of Hamilton's principle of stationary action. They can be applied to ODEs, PDEs and to both conservative and forced systems. In the absence of forcing terms these methods preserve momenta related to symmetries of the problem and don't dissipate energy, so they exhibit long-term stability and good long-term behaviour.

Considering a configuration manifold V, the discrete Lagrangian is a function from \(V\times V\) to the real numbers, which represents an approximation of the action between two fixed configurations: $$ L_d(\mathbf q_k,\mathbf q_{k+1}) \approx \int_{t_k}^{t_{k+1}} L(\mathbf q,\dot{\mathbf q};t) dt \hspace{0.5cm}\text{with}\hspace{0.4cm}\mathbf q_{k},\mathbf q_{k+1}\hspace{0.3cm}\text{fixed.}$$ From here, applying Hamilton's principle, we get the discrete Euler-Lagrange equations: $$D_2L_d(\mathbf q_{k-1},\mathbf q_k) + D_1 L_d(\mathbf q_k,\mathbf q_{k+1}) = 0\ ,$$ and thanks to the discrete Legendre transforms we get the discrete Hamilton's equations of motion: $$\left\{\begin{array}{l} \mathbf p_k = -D_1 L_d(\mathbf q_k,\mathbf q_{k+1}) \\ \mathbf p_{k+1} = D_2 L_d(\mathbf q_k,\mathbf q_{k+1}) \end{array}\right.\ ,$$ so that we pass from a 2-step system of order N to a 1-step system of order 2N. This system gives the updating map: $$ (\mathbf q_k,\mathbf p_k)\rightarrow(\mathbf q_{k+1},\mathbf p_{k+1})\ .$$ For all the theory behind this I refer to "Discrete mechanics and variational integrators" by Marsden and West.

Spectral variational integrators

To create a spectral variational integrator I considered a discretization of the configuration manifold on a n-dimensional functional space generated by the orthogonal basis of Legendre polynomials. So that, after rescaling the problem from \([t_k,t_{k+1}]\) (with \(t_{k+1}-t_k=h\)) to \([-1,1]\), we get: $$ \begin{array}{l} \mathbf q_k(z) = \sum_{i=0}^{n-1}\mathbf q_k^i l_i(z)\\ \dot{\mathbf q}_k(z) = \frac{2}{h} \sum_{i=0}^{n-1}\mathbf q_k^i \dot l_i(z) \end{array} \ .$$ Then I approximate the action using the Gaussian quadrature rule using \(m\) internal nodes, so putting all together we have: $$ \int_{t_k}^{t_{k+1}} L(\mathbf q,\dot{\mathbf q};t)dt\hspace{0.5cm} \approx\hspace{0.5cm} \frac{h}{2}\sum_{j=0}^{m-1} \omega_j L\big( \sum_{i=0}^{n-1}\mathbf q_k^i l_i , \frac{2}{h} \sum_{i=0}^{n-1}\mathbf q_k^i \dot l_i \big) $$ Now imposing Hamilton's principle of stationary action under the constraints: $$ \mathbf q_k = \sum_{i=0}^{n-1}\mathbf q_k^i l_i(-1) \hspace{1.5cm} \mathbf q_{k+1} = \sum_{i=0}^{n-1}\mathbf q_k^i l_i(1)\ ,$$ we obtain the system: $$ \left\{ \begin{array}{l} \sum_{j=0}^{m-1}\omega_j \bigg[ \mathbf p_j \dot{l}_s(z_j) - \frac{h}{2} l_s(z_j) \frac{\partial H}{\partial \mathbf q} \bigg ( \sum_{i=0}^{n-1}\mathbf q_k^i l_i(z_j),\mathbf p_j \bigg ) \bigg ] + l_s(-1)\mathbf p_k - l_s(1)\mathbf p_{k+1} = 0 \hspace{1cm} \forall s=0,\dots,n-1\\ \frac{\partial H}{\partial \mathbf p}\bigg (\sum_{i=0}^{n-1}\mathbf q_k^i l_i(z_r),\mathbf p_r\bigg ) -\frac{2}{h}\ \sum_{i=0}^{n-1} \mathbf q_k^i \dot{l}_i(z_r)=0 \hspace{1cm} \forall r=0,\dots,m-1 \\ \sum_{i=0}^{n-1} \mathbf q_k^i l_i(-1) - \mathbf q_k = 0\\ \sum_{i=0}^{n-1} \mathbf q_k^i l_i(1) - \mathbf q_{k+1} = 0 \end{array}\right. $$


Within odeSPVI I implemented a geometric integrator for Hamiltonian systems, like odeSE and odeSV, which uses spectral variational integrators with arbitrary order for the basis polynomials and an arbitrary number of internal quadrature nodes, for both one-dimensional and multidimensional problems.

This solver just uses the stepper spectral_var_int.m, which is the function where I implemented the solution of the system displayed above; hamilton_equations must be a function_handle or an inline function in which the user implements Hamilton's equations of motion: $$ \left\{\begin{array}{l} \dot{\mathbf q} = \frac{\partial H}{\partial \mathbf p}(\mathbf q,\mathbf p)\\ \dot{\mathbf p} = - \frac{\partial H}{\partial \mathbf q}(\mathbf q,\mathbf p) + \mathbf f(\mathbf q,\mathbf p)\end{array}\right.$$ where \(H(\mathbf q,\mathbf p)\) is the Hamiltonian of the system and \(\mathbf f(\mathbf q,\mathbf p)\) is an optional forcing term.

This below is the implementation of the stepper:

function [t_next,x_next,err]=spectral_var_int(f,t,x,dt,options)
fsolve_opts = optimset('Jacobian','on');

q_dofs = odeget(options,'Q_DoFs',[],'fast');
p_dofs = odeget(options,'P_DoFs',[],'fast');
nodes = odeget(options,'Nodes',[],'fast');
weights = odeget(options,'Weights',[],'fast');
leg = odeget(options,'Legendre',[],'fast');
deriv = odeget(options,'Derivatives',[],'fast');
extremes = odeget(options,'Extremes',[],'fast');

dim = length(x)/2;
N = q_dofs+p_dofs+2;
q0 = x(1:dim)(:)';
p0 = x(dim+1:end)(:)';

SPVI = @(y)svi_system(y,q_dofs,p_dofs,weights,leg, ...
y0 = [q0;zeros(q_dofs-1,dim);ones(p_dofs,1)*p0;q0;p0];
y0 = reshape(y0,dim*N,1);
z = fsolve(SPVI,y0,fsolve_opts);
%z = inexact_newton(SVI,y0,new_opts) %new_opts must be set
z = reshape(z,N,dim);

y = [leg*z(1:q_dofs,1:dim),z((q_dofs+1):(end-2),1:dim)];
x_next = [y;z(end-1,:),z(end,:)]';
t_next = [t+(dt/2)*(nodes+1), t+dt]';

q_dofs_err = odeget(options,'Q_DoFs_err',[],'fast');
p_dofs_err = odeget(options,'P_DoFs_err',[],'fast');
nodes_err = odeget(options,'Nodes_err',[],'fast');
weights_err = odeget(options,'Weights_err',[],'fast');
leg_err = odeget(options,'Legendre_err',[],'fast');
deriv_err = odeget(options,'Derivatives_err',[],'fast');
extreme_err = odeget(options,'Extremes_err',[],'fast');

N_err = q_dofs_err+p_dofs_err+2;
q1 = x_next(1:dim,end)(:)';
p1 = x_next(dim+1:end,end)(:)';

SPVI = @(y)svi_system(y,q_dofs_err,p_dofs_err, ...
weights_err,leg_err,deriv_err,extreme_err, ...
p_interp = zeros(p_dofs_err,dim);

p_lower_order = [p0;z(q_dofs+1:q_dofs+p_dofs,:);p1];
for i=1:1:p_dofs_err
p_interp(i,:) = .5*(p_lower_order(i,:) + ...

y0 = [z(1:q_dofs,:);zeros(1,dim);p_interp;q1;p1];
y0 = reshape(y0,dim*N_err,1);
z = fsolve(SPVI,y0,fsolve_opts);
%z = inexact_newton(SVI,y0,new_opts) %new_opts must be set
z = reshape(z,N_err,dim);

x_est = [z(end-1,:),z(end,:)]';
err = norm(x_est-x_next(:,end),2);
At lines 22 and 56 I put a comment line to show that the implicit system can also be solved with inexact_newton (the solver explained in the previous post on backward Euler), so it is not mandatory to use fsolve. I will turn this into an option so that the user can choose which solver to use.

Another important aspect to point out is that, in the case of an adaptive timestep, the error is estimated using a new solution on the same timestep but with polynomials one order higher and one more internal node for the quadrature rule. Furthermore, for this last solution, the starting guess for fsolve is chosen in a non-trivial way: at line 53 we see that y0 contains, in its first q_dofs_err-1 rows, the same modal values previously computed for the new solution x_next, with a row of zeros just below. Then the starting nodal values for the momenta are set (lines 47-51) as a simple average of the nodal values of the new solution. This may seem crude, but I found empirically that fsolve performs the same number of iterations as with more sophisticated reinitializations of the nodal values, so the simple average is computationally cheaper.

The following code shows the implementation of the system to be solved:

function [F,Jac] = svi_system(y,q_dofs,p_dofs,w,L,D,C, ...
F = zeros(N*dim,1);
V = zeros(p_dofs,dim*2);
X = zeros(dim*2,1);
W = reshape(y,N,dim);
for i = 1:1:p_dofs
X = [L(i,:)*W(1:q_dofs,:),W(i+q_dofs,:)]';
V(i,:) = f(t,X);
for i = 1:1:dim
F((1+N*(i-1)):(q_dofs+N*(i-1)),1) = (ones(1,p_dofs)* ...
(((w.*y((q_dofs+1+N*(i-1)):(q_dofs+p_dofs+N*(i-1))))* ...
ones(1,q_dofs)).*D + (((dt/2).*w.*V(:,i+dim))* ...
ones(1,q_dofs)).*L) + (p0(i)*ones(1,q_dofs)).*C(1,:) - ...
F(1+N*(i-1)+q_dofs:N*(i-1)+q_dofs+p_dofs,1) = V(:,i) - ...
F(N*i-1) = C(2,:)*y((1+N*(i-1)):(q_dofs+N*(i-1)))-y(N*i-1);
F(N*i) = C(1,:)*y((1+N*(i-1)):(q_dofs+N*(i-1))) - q0(i);
flag = 0;
flag = 1;
DV = zeros((dim*2)^2,p_dofs);
if( flag==1 )
for i = 1:1:p_dofs
X = [L(i,:)*W(1:q_dofs,:),W(i+q_dofs,:)]';
DV(:,i) = differential(f,t,X);
for i = 1:1:p_dofs
X = [L(i,:)*W(1:q_dofs,:),W(i+q_dofs,:)]';
[~,DV(:,i)] = f(t,X);
DV = DV';
for u=1:1:dim
Here, at line 46, I don't show the implementation of the Jacobian of the implicit system because, with this visualization style, it would appear very chaotic. The important aspect is that the user can use the implemented Jacobian to speed up the computation; I found that it really does speed things up when using fsolve as the solver. Furthermore, the code checks whether the user has implemented, in the function defining Hamilton's equations, the Hessian of the Hamiltonian as well (which is needed in the computation of the Jacobian of the system). If the function returns the Hessian as its second output, the program uses it and the computation is faster; otherwise it approximates the Hessian with the following function and the computation is slower:
function Hamilt = differential(f,t,x)
f_x = f(t,x);
dim_f = length(f_x);
dim_x = length(x);
if( dim_f ~= dim_x )
error('not implemented yet');
Hamilt = zeros(dim_f*dim_x,1);
delta = sqrt(eps);
for i = 1:1:dim_f
for j = i:1:dim_x
Hamilt(i+(j-1)*dim_f) = (1/delta)*(f(t,x+delta* ...
[zeros(j-1,1);1;zeros(dim_x-j,1)])(i) - f_x(i));
Hamilt(j+(i-1)*dim_x) = Hamilt(i+(j-1)*dim_f);

The Hessian of the Hamiltonian must be stored in a particular way (which I still have to optimize, although the current one works fine), shown in the following example: the definition of Hamilton's equations for the harmonic oscillator:

function [dy,ddy] = armonic_oscillator(t,y)
dy = zeros(size(y));
d = length(y)/2;
q = y(1:d);
p= y(d+1:end);

dy(1:d) = p;
dy(d+1:end) = -q;

H = eye(2*d);
ddy = transf(H);

function v = transf(H)
[r,c] = size(H);
v = zeros(r*c,1);
for i=1:1:r
for j=1:1:c
v(i+(j-1)*r) = H(i,j);

by Roberto Porcù at December 09, 2013 08:20 AM

December 02, 2013

Roberto Porcù

Symplectic Euler (semi-implicit Euler), Velocity-Verlet and Stormer-Verlet methods

Symplectic Euler method

The symplectic Euler method is a modification of the Euler method and is useful for solving Hamilton's equations of motion, that is, a system of ODEs whose unknowns are the generalized coordinates q and the generalized momenta p. It is of first order, but it is better than the classical Euler method because it is a symplectic integrator, so it yields better results.

Given a Hamiltonian system with Hamiltonian H=H(t;q,p) then the system of ODE to solve writes: $$\left\{\begin{array}{l} \dot{q} = \frac{dH}{dp}(t;q,p) = f(t;q,p)\\ \dot{p}=-\frac{dH}{dq}(t;q,p) = g(t;q,p) \end{array}\right. $$

From E. Hairer, C. Lubich and G. Wanner, "Geometric Numerical Integration", the semi-implicit Euler scheme for the previous ODEs writes: $$ \left\{\begin{array}{l} p_{k+1} = p_k + h g(t_k;q_k,p_{k+1}) \\ q_{k+1}=q_k + h f(t_k;q_k,p_{k+1}) \end{array}\right. $$ If g(t;q,p) does not depend on p, this scheme is totally explicit; otherwise the first equation is implicit and the second is explicit.
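The explicit case is easy to illustrate on the harmonic oscillator \(H = (p^2 + q^2)/2\), where \(f(t;q,p) = p\) and \(g(t;q,p) = -q\). This is an illustrative Python sketch, not the odepkg stepper:

```python
# Explicit symplectic Euler step for H = (p^2 + q^2)/2:
# p_{k+1} = p_k + h*g(q_k) with g(q) = -q, then
# q_{k+1} = q_k + h*f(p_{k+1}) with f(p) = p.

def symplectic_euler_step(q, p, h):
    p_next = p - h * q        # kick: update momentum first
    q_next = q + h * p_next   # drift: update position with the NEW momentum
    return q_next, p_next

q, p = 1.0, 0.0
for _ in range(1000):         # integrate up to t = 10 with h = 0.01
    q, p = symplectic_euler_step(q, p, 0.01)
# Being symplectic, the method keeps the energy (p^2 + q^2)/2 oscillating
# near its initial value 0.5 instead of drifting like explicit Euler.
```

Note the asymmetry: using the updated momentum in the position update is exactly what makes the map symplectic; updating both from the old state would give the non-symplectic explicit Euler.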

The function that I created to apply this integrator is odeSE.m, and it passes to the integrate functions the stepper defined in symplectic_euler.m. The signature of odeSE is similar to those of the other ODE solvers:

It's important to know that y0 must be a vector containing the initial values of the coordinates in its first half and the initial values of the momenta in its second half; the function handle (or inline function) ode_fcn must take exactly two input arguments (the first is the time, the second is the vector containing coordinates and momenta, as in y0) and return a vector containing dq/dt in its first half and dp/dt in the second half.

The options variable can be set with odeset and, if the system of ODEs is explicit, the field options.Explicit can be set to 'yes' in order to speed up the computation. If tspan has only one element, then options.TimeStepNumber and options.TimeStepSize must not be empty, and the integrate function integrate_n_steps will be used.

This is the code of the stepper symplectic_euler.m:

function [x_next,err] = symplectic_euler(f,t,x,dt,options)
dim = length(x)/2;
q = x(1:dim);
p = x(dim+1:end);
if( strcmp(options.Explicit,'yes') )
p_next = p + dt*(f(t,[q; p])(dim+1:end));
q_next = q + dt*f(t,[q; p_next])(1:dim);
x_next = [q_next; p_next];
if(nargout == 2)
dt_new = dt/2;
p_next = p + dt_new*(f(t,[q; p])(dim+1:end));
q_next = q + dt_new*f(t,[q; p_next])(1:dim);
q = q_next;
p = p_next;
t = t+dt_new;
p_next = p + dt_new*(f(t,[q; p])(dim+1:end));
q_next = q + dt_new*f(t,[q; p_next])(1:dim);
x_est = [q_next;p_next];
err = norm(x_next-x_est,2);
SE1 = @(y) (y-p-dt*(f(t,[q; y])(dim+1:end)));
p_next = fsolve(SE1,zeros(size(p)));
q_next = q + dt*f(t,[q; p_next])(1:dim);
x_next = [q_next; p_next];
if(nargout == 2)
dt_new = dt/2;
SE1 = @(y) (y-p-dt_new*(f(t,[q; y])(dim+1:end)));
p_next = fsolve(SE1,zeros(size(p)));
q_next = q + dt_new*f(t,[q; p_next])(1:dim);
q = q_next;
p = p_next;
t = t+dt_new;
SE1 = @(y) (y-p-dt_new*(f(t,[q; y])(dim+1:end)));
p_next = fsolve(SE1,zeros(size(p)));
q_next = q + dt_new*f(t,[q; p_next])(1:dim);
x_est = [q_next;p_next];
err = norm(x_next-x_est,2);

Velocity-Verlet method

The velocity-Verlet method is a numerical method used to integrate Newton's equations of motion. The Verlet integrator offers greater stability, as well as other properties that are important in physical systems such as preservation of the symplectic form on phase space, at no significant additional cost over the simple Euler method.

If we call x the coordinate, v the velocity and a the acceleration then the equations of motion write: $$\left\{ \begin{array}{l} \frac{dx}{dt} = v(t,x)\\ \frac{dv}{dt} = a(t,x) \end{array} \right. $$

So, given the initial conditions (coordinates and velocities), the velocity-Verlet scheme writes: $$ \left\{ \begin{array}{l} x_{k+1} = x_k + h v_k + 0.5 h^2 a_k\\ v_{k+1} = v_k + 0.5 h (a_k + a_{k+1}) \end{array}\right. $$ where $$ a_{k+1} = a(t_{k+1},x_{k+1})\ .$$
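As a concrete illustration of the two-stage update above (not the odepkg stepper), here is a Python sketch for the harmonic oscillator, where \(a(t,x) = -x\) and the exact solution of \(x(0)=1,\ v(0)=0\) is \(\cos t\):

```python
import math

# One velocity-Verlet step for a position-dependent acceleration a(x):
# the new position uses the old acceleration, the new velocity averages
# old and new accelerations, matching the scheme above.

def velocity_verlet_step(x, v, h, a):
    a_k = a(x)
    x_next = x + h * v + 0.5 * h * h * a_k
    v_next = v + 0.5 * h * (a_k + a(x_next))
    return x_next, v_next

x, v = 1.0, 0.0
h = 0.01
for _ in range(100):                      # integrate to t = 1
    x, v = velocity_verlet_step(x, v, h, lambda y: -y)
# x is now close to cos(1), v close to -sin(1), with second-order error.
```

The acceleration at the new position is evaluated once per step and can be reused as `a_k` in the next step, so the cost per step is a single force evaluation, as the text notes.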

This method is one order better than the symplectic Euler method: its global error is of order two. Furthermore, if the acceleration indeed results from the forces of a conservative mechanical or Hamiltonian system, the energy of the approximation essentially oscillates around the constant energy of the exactly solved system, with a global error bound again of order two.

The function that uses velocity-Verlet scheme is odeVV.m and the stepper called at each iteration is velocity_verlet.m. The signature of odeVV is the same of those of others ODE solvers:


The input arguments are as described in the previous symplectic Euler section, with the difference that now the function ode_fcn must return a vector containing the velocities in its first half and the accelerations in its second half.

This is the code of the stepper velocity_verlet.m:

function [x_next,err] = velocity_verlet(f,t,x,dt,options)
dim = length(x)/2;
q = x(1:dim);
v = x(dim+1:end);
a = f(t,x);
q_next = q + dt*v + .5*dt*dt*a(dim+1:end);
v_next = v + .5*dt*((a+f(t+dt,[q_next; v]))(dim+1:end));
x_next = [q_next; v_next];
if(nargout == 2)
dt_new = dt/2;
q_next = q + dt_new*v + .5*dt_new*dt_new*a(dim+1:end);
v_next = v + .5*dt_new*((a + ...
t = t+dt_new;
q = q_next;
v = v_next;
a = f(t,[q; v]);
q_next = q + dt_new*v + .5*dt_new*dt_new*a(dim+1:end);
v_next = v + .5*dt_new*((a + ...
x_est = [q_next; v_next];
err = norm(x_next - x_est,2);

Stormer-Verlet method

The Stormer-Verlet scheme is a symplectic integrator of order two, useful to integrate Hamiltonian systems in the form described in the previous symplectic Euler section. It is characterized by the approximation of the integral defining the discrete Lagrangian $$ L_d(q_k,q_{k+1})\approx\int_{t_k}^{t_{k+1}}L(t;q,\dot{q})dt $$ with the trapezoidal rule. So that $$ L_d(q_k,q_{k+1}) = \frac{h}{2}\bigg( L\Big(q_k,\frac{q_{k+1}-q_k}{h}\Big) + L\Big(q_{k+1},\frac{q_{k+1}-q_k}{h}\Big) \bigg)\ .$$

Considering again the system: $$ \left\{\begin{array}{l} \dot{q} = \frac{dH}{dp}(t;q,p) = f(t;q,p)\\ \dot{p}=-\frac{dH}{dq}(t;q,p) = g(t;q,p) \end{array}\right. \ ,$$ the scheme implemented for this integrator (described in E. Hairer, C. Lubich and G. Wanner, "Geometric Numerical Integration") writes as follows: $$ \left\{\begin{array}{l} p_{k+\frac{1}{2}} = p_k + \frac{h}{2}g(t_k;q_k,p_{k+\frac{1}{2}}) \\ q_{k+1}=q_k+\frac{h}{2}\Big( f(t_k;q_k,p_{k+\frac{1}{2}}) + f(t_{k+1};q_{k+1},p_{k+\frac{1}{2}})\Big) \\ p_{k+1} = p_{k+\frac{1}{2}} + \frac{h}{2} g(t_{k+\frac{1}{2}};q_{k+1},p_{k+\frac{1}{2}}) \end{array}\right. $$
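For a separable Hamiltonian such as \(H = (p^2 + q^2)/2\) (where \(f = p\) depends only on p and \(g = -q\) only on q), every stage of the scheme above becomes explicit: a half "kick" of the momentum, a full "drift" of the position, and another half kick. An illustrative Python sketch, not the odepkg stepper:

```python
# Explicit Stormer-Verlet (leapfrog) step for H = (p^2 + q^2)/2,
# following the three-stage scheme above with f(p) = p, g(q) = -q.

def stormer_verlet_step(q, p, h):
    p_mid = p - 0.5 * h * q                  # half kick
    q_next = q + 0.5 * h * (p_mid + p_mid)   # drift: f evaluated at p_mid twice
    p_next = p_mid - 0.5 * h * q_next        # half kick
    return q_next, p_next

q, p = 1.0, 0.0
for _ in range(628):          # roughly one period (2*pi) with h = 0.01
    q, p = stormer_verlet_step(q, p, 0.01)
# (q, p) returns near the starting point (1, 0), and the energy
# (p^2 + q^2)/2 stays close to 0.5 throughout: the order-two,
# symplectic behaviour described in the text.
```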

The function in which this method is implemented is odeSV.m, which calls the stepper stormer_verlet.m. The documentation is the same as for odeSE.m. Its implementation is the following:

function [x_next,err] = stormer_verlet(f,t,x,dt,options)
dim = length(x)/2;
q = x(1:dim);
p = x(dim+1:end);
if( strcmp(options.Explicit,'yes') )
p_mid = p + .5*dt*(f(t,[q; p])(dim+1:end));
q_next = q + .5*dt*((f(t,[q; p_mid])(1:dim))+ ...
p_next = p_mid +.5*dt*(f(t+dt,[q_next;p_mid])(dim+1:end));
x_next = [q_next; p_next];
if(nargout == 2)
dt_new = dt/2;
p_mid = p + .5*dt_new*(f(t,[q; p])(dim+1:end));
q_next = q + .5*dt_new*((f(t,[q; p_mid])(1:dim))+ ...
p_next = p_mid + .5*dt_new* ...
q = q_next;
p = p_next;
t = t+dt_new;
p_mid = p + .5*dt_new*(f(t,[q; p])(dim+1:end));
q_next = q + .5*dt_new*((f(t,[q; p_mid])(1:dim))+ ...
p_next = p_mid + .5*dt_new* ...
x_est = [q_next; p_next];
err = norm(x_est - x_next,2);
SV1 = @(y) (y - p - .5*dt*(f(t,[q;y])(dim+1:end)));
p_mid = fsolve(SV1,zeros(size(p)));
SV2 = @(y) (y - q - .5*dt*((f(t,[q;p_mid])(1:dim))+ ...
q_next = fsolve(SV2,q);
p_next = p_mid + .5*dt* ...
x_next = [q_next;p_next];
if(nargout == 2)
dt_new = dt/2;
SV1 = @(y) (y - p - .5*dt_new*(f(t,[q;y])(dim+1:end)));
p_mid = fsolve(SV1,zeros(size(p)));
SV2 = @(y) (y - q - .5*dt_new* ...
q_next = fsolve(SV2,q);
p_next = p_mid + .5*dt_new* ...
q = q_next;
p = p_next;
t = t+dt_new;
SV1 = @(y) (y - p - .5*dt_new*(f(t,[q;y])(dim+1:end)));
p_mid = fsolve(SV1,zeros(size(p)));
SV2 = @(y) (y - q - .5*dt_new* ...
q_next = fsolve(SV2,q);
p_next = p_mid + .5*dt_new* ...
x_est = [q_next; p_next];
err = norm(x_next-x_est,2);

by Roberto Porcù at December 02, 2013 05:15 AM

November 17, 2013

Marco Vassallo

update #16

The first version of the pkg has just been released !!!
Install it with

  >> pkg install fem-fenics -forge

I'm also writing better documentation for the code. It will be ready soon, but for the moment you can have a look here.

by Gedeone at November 17, 2013 01:28 PM

November 15, 2013

Roberto Porcù

Ode solvers options


What I want to do at this point of the project is to add, to all the solvers written so far, the ability to manage all the options of the ODE structure (defined previously with odeset) that are already handled by the corresponding MATLAB ODE solvers. The solvers I've written will then be fully MATLAB-compatible, and the same will hold for the new solvers I'm going to implement.

by Roberto Porcù at November 15, 2013 08:13 AM

November 12, 2013

Carnë Draug

Snapshot release of image package 2.1.1

The image package has accumulated a lot of changes since its last release, and I'm hoping to make a new release soon (to match the release of Octave 3.8.0). I have prepared a snapshot tarball (version 2.1.1) with the current development status, which can be installed easily with pkg. It would be great if users of the image package could give it a try and report any problems that the many changes may have caused.

It is important to note that this is not a final release. To make this clear, and to prevent it from being distributed as if it were, I made it dependent on a yet-unreleased version of Octave (it depends on the development version of Octave anyway) and made it print a warning about this every time the package is loaded. After reading this, download the tarball and install it with pkg install -nodeps /path/to/tarball

I am partial to some of the changes in the new version; these are the ones I paid most attention to:

  • complete rewrite of imdilate and imerode with a performance improvement and many compatibility bugs fixed;
  • implementation of the @strel class and support for it in the image package functions;
  • support for N-dimensional matrices in several functions;
  • rewrite of the block processing functions which in some cases performs 1000x faster.

There are also a lot of bug fixes and new functions. Some changes break backwards compatibility quite badly, but they were needed for the sake of Matlab compatibility. For example, bwconncomp was returning indices for the object perimeter, but should have been returning indices for all elements in the objects. So do take a look at the NEWS file, or run news image after installing the package.

After this release, I plan to follow the Octave core release method of keeping two development branches: a stable branch for minor releases with small bug fixes and regressions, and a default branch for big changes. Hopefully that will allow for more frequent releases as things of different levels get ready.

Please report any problems you have found either in Octave’s bug or patch trackers.

by carandraug at November 12, 2013 03:38 AM

November 11, 2013

Roberto Porcù



The following list explains the main points of the timeline for my SOCIS project:

  1. adapt odeset and odeget from Octave odepkg to be completely MATLAB-compatible: all option names supported by MATLAB should be available; furthermore we want the possibility to add new option names that will be useful for example for symplectic integrators;
  2. implement a new code structure for one-step integrators ode12, ode23, ode45, odefwe (forward Euler), odebwe (backward Euler) and inexact solvers, subdividing the execution into 'blocks' in order to be able to do an easier optimization and an easier extension to new functionalities; in particular implement the steppers which will be the functions called to do one integration step and then implement the integrate functions which will integrate, in different ways, over the whole time interval, calling the respective steppers. Afterwards adapt the interface of ode45 to be completely MATLAB-compatible;
  3. implement MATLAB-compatible version of deval;
  4. implement the following geometric integrators as new steppers:
    • symplectic Euler;
    • Stormer-Verlet;
    • velocity-Verlet;
    • spectral variational integrators;
    • SHAKE;
    • RATTLE.
  5. optimization, error checking and documentation of all the code written.

At the moment, point 1 is completed; in point 2 I still have to adapt the interface of ode45 and refine the implemented code. Point 3 has yet to be done. In point 4 I have implemented symplectic Euler, Stormer-Verlet and spectral variational integrators, but these need to be adapted to the 'steppers' / 'integrate functions' code structure; the other integrators must still be implemented. Finally, during the implementation I will start writing a bit of documentation, and at the end I will do optimization and debugging.

by Roberto Porcù at November 11, 2013 02:17 PM

October 25, 2013

Roberto Porcù

Starting Stepsize


The implementation of the integrate_adaptive function takes (for now) the first timestep as its last input argument. Obviously, the first timestep chosen by the user may not be the best, even though the adaptive technique will adjust the step at the second iteration. For this reason, there exist many techniques to find a good timestep for the first iteration.

Here I present the one I implemented, which was proposed by Gladwell, Shampine and Brankin in 1987. It is based on the hypothesis that: $$ \text{local error}\approx Ch^{p+1}y^{(p+1)}(x_0)\ . $$ The algorithm is the following, and it usually gives a good guess for the initial timestep.

function h = starting_stepsize(f,x0,t0,p)
  d0 = norm(x0,2);
  d1 = norm(f(t0,x0),2);
  if( d0 < 1.e-5 || d1 < 1.e-5 )
    h0 = 1.e-6;
  else
    h0 = .01*(d0/d1);
  endif
  x1 = x0 + h0.*f(t0,x0);
  d2 = (1.0/h0)*norm(f(t0+h0,x1)-f(t0,x0),2);
  if( max(d1,d2) <= 1.e-15 )
    h1 = max(1.e-6,h0*1.e-3);
  else
    h1 = (1.e-2/max(d1,d2))^(1/(p+1));
  endif
  h = min(100*h0,h1);
endfunction
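For reference, the same heuristic transliterated into Python for a scalar ODE x' = f(t, x) (a sketch for experimenting with the algorithm, not the package's code):

```python
from math import fabs

def starting_stepsize(f, x0, t0, p):
    # Heuristic first-step estimate for a method of order p
    # (Gladwell, Shampine and Brankin, 1987); scalar version.
    d0 = fabs(x0)
    d1 = fabs(f(t0, x0))
    if d0 < 1e-5 or d1 < 1e-5:
        h0 = 1e-6
    else:
        h0 = 0.01 * (d0 / d1)
    # Take one explicit Euler step to estimate the second derivative.
    x1 = x0 + h0 * f(t0, x0)
    d2 = fabs(f(t0 + h0, x1) - f(t0, x0)) / h0
    if max(d1, d2) <= 1e-15:
        h1 = max(1e-6, h0 * 1e-3)
    else:
        h1 = (1e-2 / max(d1, d2)) ** (1.0 / (p + 1))
    return min(100 * h0, h1)
```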

What I will do, then, is include this algorithm in the integrate_adaptive function, which will then have a new signature, since the initial timestep is no longer needed:

function [x,t] = integrate_adaptive(stepper,order,f,x0,t0,t1)

by Roberto Porcù ( at October 25, 2013 08:26 AM

October 06, 2013

Wendy Liu

Sudoku code: another encryption scheme

Another useless encryption scheme devised by yours truly. While my last one (pi code) was primarily a substitution cipher, this one transcends all standard classifications; it's almost like a transposition cipher, but not really. The main idea here is the use of certain numbers in a particular Sudoku grid. The strength (if any) in this method lies in its unexpected nature; it certainly takes quite a leap of the imagination to correctly deduce the method from the ciphertext. Of course, once the method has been discovered, deciphering merely involves solving a Sudoku grid and then figuring out the substitution cipher used, meaning that the key is easy to determine and so this method kind of just looks at Kerckhoffs's Principle and then keeps walking. But that's okay, that's why this is filed under useless.

The setup

The first step is to find a Sudoku puzzle. I happened to have a Sudoku game installed, so I just ran that and started a random puzzle, which looked like this:

5 1 7 8
3 1 9 5 4
3 1 5
2 8
4 7 5 9
9 4
6 5 2
2 6 5 4 8
4 3 7 2

The (unique) solution to the above puzzle looks like this:

5 2 4 1 7 9 3 8 6
3 1 9 6 5 8 4 2 7
8 6 7 3 2 4 9 1 5
2 9 5 4 8 3 6 7 1
4 8 6 7 1 5 2 3 9
1 7 3 2 9 6 8 5 4
6 5 1 8 4 2 7 9 3
7 3 2 9 6 1 5 4 8
9 4 8 5 3 7 1 6 2

Clearly, each solved grid can be represented by a sequence of 81 numbers. Furthermore, due to the uniqueness of the solution, the initial grid and the solved grid are equivalent in the sense that the initial grid uniquely determines the solved grid. So we represent the initial grid as follows (with 0s standing for spaces):


This results in the following solution grid:


Now, depending on the length of the plaintext, we can choose somewhere between one and five numbers to use as the "holes" in the grid. For example, if we had a plaintext of 45 characters or fewer, we could choose the numbers 1, 2, 5, 6 and 9. Removing those numbers from the grid would leave us with this:

4 7 3 8
3 8 4 7
8 7 3 4
4 8 3 7 1
4 8 7 3
7 3 8 4
8 4 7 3
7 3 6 4 8
4 8 3 7

This has an entirely different arrangement of empty spaces than the initial grid. We can represent it using the following string:


We can then use the empty spaces to write our message, and add characters in the rest of the spaces to try to flatten letter frequencies. A simple substitution cipher should be used as well, to prevent the ciphertext from looking too much like a transposition cipher. The alphabetic characters can then be interspersed with the numerals above, with the resulting string separated into chunks of 32 characters (with extra alphabetic characters appended as necessary, since only 81 are needed) so as to superficially resemble MD5 (because that's just fun).

Optimally, the solutions to the possible Sudoku grids being used should be stored in a dictionary somewhere, so that each grid only has to be solved once. The point of this encryption scheme is to be diabolically tricky but solvable, although perhaps only if you figure out the Sudoku aspect and know a crib or two.

Encoding and decoding a sequence

Let's say you want to encode:

admiral yamamoto to visit solomon islands eight am

First, remove spaces:


This is 43 characters, so let's aim for one grid with 5 numbers. This means there will be 2 extra invalid characters and 36 nonsense characters. Since some spaces will be fairly large, it's a good idea to use a simple substitution cipher as well (or even a less simple one like Vigenère for additional fiendishness). In this case, we'll use the reverse alphabet cipher:
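The reverse alphabet cipher (Atbash) maps 'a' to 'z', 'b' to 'y', and so on. A minimal Python sketch:

```python
def reverse_cipher(text):
    # Map each lowercase letter to its mirror: 'a'<->'z', 'b'<->'y', ...
    return ''.join(chr(ord('a') + ord('z') - ord(c)) for c in text)
```

Since the mapping is its own inverse, applying the function twice returns the original text, so the same function both encrypts and decrypts.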


So the encrypted plaintext becomes


Now put it in the grid:

z w 4 n 7 r 3 8 i
3 z o b z 8 4 n 7
8 z 7 3 n 4 l g l
g l e 4 8 3 r 7 h
4 8 r 7 g h l 3 o
l 7 3 n l m 8 r 4
h o z 8 4 m 7 w 3
7 3 h v r t s 4 8
g 4 8 z 3 7 n c a

(The last two characters are the extra invalid characters used to fill up the empty spaces.)

Now replace the remaining numbers with random letters, chosen so as to flatten the letter frequencies as much as possible:

z w e n a r o t i
q z o b z a e n s
f z t r n m l g l
g l e e a s r h h
t r r a g h l m o
l c c n l m z r v
h o z l p m p w p
a e h v r t s b d
g g f z e q n c a

Written as one line, the above grid would look like:


Recall that the original grid looks as follows:


With the numbers used added to the end of the grid, it might look like this:


Now, we randomly combine the two: (split across two lines here due to design considerations)


We could even put in extra characters at the end, since only the first 81 characters will be used:


Here's how it would look if we split it up into 32-character chunks:

z50w0ena1r7o008tiq0zo3bz1ae9nsf0 z5t0r4n0mlgl0g0le0e0as3rhh00tr0r ag1h5lm2ol00ccn0l8mzr0v0ho0z0400 lp7mpwp0a500eh9v0r0t0sbd090gg004 6f500ze02000q0020n6054c80400a3f7 0gsssg02ea1e5dllg9famezelr2ezz6x

And there you go. Not indecipherable but certainly very misleading. To decipher, first separate the numerals from the alphabetic characters, then write out the first 81 characters within a 9x9 grid. Then fill another 9x9 grid with the first 81 numerals (one digit per box), omitting the zeros, and solve the resulting Sudoku puzzle. Take the remaining (non-zero) digits and highlight the positions in the grid corresponding to those digits. The characters corresponding to the non-highlighted positions can be discarded. Now it only remains to reverse the substitution cipher and insert spaces where necessary.
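The first deciphering step, separating the numerals from the alphabetic characters, is just two regular expressions. A minimal sketch on a short hypothetical fragment (not the full ciphertext above):

```python
import re

fragment = "z50w0ena1r7o"  # hypothetical snippet for illustration
digits = re.findall(r'[0-9]', fragment)   # grid digits, in order
letters = re.findall(r'[a-z]', fragment)  # grid characters, in order
```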


dszb5c0ygo01ize7c0u0n80s31rwq9m0 5f0400ll0r00mw30atdx00r15gzs2000 800jq0z0n400bz7j05obo00rtbt90u0v f0u0oo9r0v00pmfvzoxxmg4650f00g20 0mnz000o20rgo605ub48ubzxxo040a03 70kteo02osz3dfle4eraosme2tn5sddq

Implementation in Python

All the relevant code is available in the repository for this website, under code/sudokucode. The 3 main files are shown below.

import json
import re
import random
import string

A_ORD = ord('a')
NOT_LETTERS_RE = re.compile('[^a-z]')
LETTERS_RE = re.compile('[a-z]')
NUMBERS_RE = re.compile('[0-9]')
POSSIBLE_DIGITS = map(str, range(1, 10))

# Does the reverse alphabet cipher thing on a string
def reverse_cipher(plaintext):
    real_letters = []
    for letter in plaintext:
        real_letters.append(chr(A_ORD + 25 - (ord(letter) - A_ORD)))
        # whatever it works don't h8

    return ''.join(real_letters)

def contains_only_letters(plaintext):
    return NOT_LETTERS_RE.search(plaintext) is None

def sometimes():
    return random.choice((True, False))

class SudokuCoder:
    def __init__(self):
        try:
            self.grids = json.load(open('grids.json'))
        except IOError:
            # No grids in memory - can only encode, not decode.
            self.grids = {}

    def encode(self, plaintext):
        plaintext = plaintext.lower()
        num_letters = len(plaintext)
        letters = reverse_cipher(plaintext)

        if not contains_only_letters(plaintext):
            raise CannotEncodeError('Can only contain alphabetic characters.')

        if num_letters > 45:
            raise CannotEncodeError('Maximum plaintext length: 45 characters.')

        # Randomly choose a grid.
        initial_grid = random.choice(self.grids.keys())
        grid = self.grids[initial_grid]

        # Randomly choose some digits to use as the holes.
        num_digits = num_letters / 9 + 1
        digits = random.sample(POSSIBLE_DIGITS, num_digits)

        letter_index = 0
        new_grid = []

        # Now replace all the hole digits with the plaintext.
        for digit in grid:
            if digit in digits and letter_index < num_letters:
                new_grid.append(letters[letter_index])
                letter_index += 1
            else:
                # Choose a random character,
                # for both the extra ones and the nonsense ones.
                new_grid.append(random.choice(string.ascii_lowercase))

        # Add extra characters depending on the number of digits.
        for digit in xrange(num_digits):
            new_grid.append(random.choice(string.ascii_lowercase))
        total_digits = initial_grid + ''.join(digits)

        # Now randomly combine them.
        grid_length = len(new_grid)
        total_length = grid_length * 2
        ciphertext = []
        letter_index = 0
        digit_index = 0

        while letter_index < grid_length or digit_index < len(total_digits):
            if ((sometimes() and letter_index < grid_length) or
                    digit_index == len(total_digits)):
                ciphertext.append(new_grid[letter_index])
                letter_index += 1
            else:
                ciphertext.append(total_digits[digit_index])
                digit_index += 1

        return ''.join(ciphertext)

    def decode(self, ciphertext):
        ciphertext = ciphertext.lower()

        # Get the grid numbers (the first 81 digits).
        all_numbers = NUMBERS_RE.findall(ciphertext)
        initial_grid = ''.join(all_numbers[:81])
        hole_numbers = all_numbers[81:]
        all_letters = LETTERS_RE.findall(ciphertext)
        grid_letters = all_letters[:81]

        # Check if the solution to this initial grid exists.
        if not initial_grid in self.grids:
            raise GridNotFoundError

        # Get the list indices of the hole numbers.
        solution_grid = self.grids[initial_grid]
        hole_indices = []

        for i in xrange(len(solution_grid)):
            if solution_grid[i] in hole_numbers:
                hole_indices.append(i)

        hole_letters = [grid_letters[index] for index in hole_indices]
        plaintext = reverse_cipher(hole_letters)

        return plaintext

    def add_grid(self, initial, solution):
        if initial in self.grids:
            raise GridAlreadyExists
        self.grids[initial] = solution
        json.dump(self.grids, open('grids.json', 'w'))

class GridNotFoundError(Exception):
    pass

class CannotEncodeError(Exception):
    pass

class GridAlreadyExists(Exception):
    pass

import sys
import utils

if len(sys.argv) == 2:
    sudoku_coder = utils.SudokuCoder()
    print sudoku_coder.encode(sys.argv[1])
else:
    print "Usage: python [plaintext]"

import sys
import utils

if len(sys.argv) == 2:
    sudoku_coder = utils.SudokuCoder()
    print sudoku_coder.decode(sys.argv[1])
else:
    print "Usage: python [ciphertext]"


Instead of encrypting the Sudoku puzzle itself, use a standard one. For example, if you and your imaginary recipient both have access to the New York Times (which we will pretend has a daily Sudoku puzzle, if it doesn't actually), then just use that. Then the only numbers you'd have to send along with the plaintext (and the extra characters) are the hole digits. This means that the key itself (or at least the prelude to the key) does not have to be distributed along with the message, resulting in a somewhat more secure scheme (for a given value of "secure"). This will work as long as you ensure that the Sudoku puzzle you choose always has a unique solution.

by dellsystem (Wendy Liu) at October 06, 2013 04:00 AM

September 26, 2013

Carnë Draug

GSoC 2013: imerode and imdilate (update #6)

Since I deviated so much from the original project while fixing the image IO functions in Octave core, I decided to only focus on optimizing the imerode and imdilate in the final phase of GSoC. The reason is that these are at the core of many functions in the image package.

The original project plan was to do all the work by expanding the __spatial_filtering__ function, and that's where I previously started. While doing so, it became evident that a complete rewrite was necessary: convn(), which could be used in most cases on binary images, was performing much faster even though it computes a more complex operation. As such, performing at least as fast as convn() became the target, which was achieved:

octave> se = [0 1 0; 1 1 1; 0 1 0];
octave> im = rand (1000) > 0.5;
octave> t = cputime (); for i = 1:100, a = imerode (im, se, "same"); endfor; cputime() - t
ans = 2.1281
octave> t = cputime (); for i = 1:100, a = convn (im, se, "same") == nnz (se); endfor; cputime() - t
ans = 2.7802
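The same convn trick reads naturally in Python/SciPy (a sketch of the equivalence, not the image package's implementation; the example image is random): a pixel survives binary erosion exactly when every pixel under the structuring element is true, i.e. when the windowed sum equals nnz(se), and for a symmetric SE convolution equals correlation.

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.signal import convolve2d

se = np.array([[0, 1, 0],
               [1, 1, 1],
               [0, 1, 0]])
rng = np.random.default_rng(0)
im = rng.random((100, 100)) > 0.5

# Erosion via convolution: true iff all nnz(se) pixels under the
# (symmetric) SE are true, i.e. the sum reaches se.sum().
eroded = convolve2d(im.astype(int), se, mode='same') == se.sum()

# Matches the dedicated erosion routine (zero padding at the borders).
reference = binary_erosion(im, structure=se, border_value=0)
```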

This implementation could be reused in __spatial_filtering__ to also speed up functions such as medfilt2, ordfilt2, and ordfiltn but there are specific algorithms for those cases which should be used instead.

I have tested the new implementation of imerode against the last release (version 2.0.0) and the last development version that was still making use of __spatial_filtering__ (cset 6db5e3c6759b). The tests are very simple (test imerode), just random matrices with different number of dimensions. The new implementation seems to perform much faster in all cases, and shows a performance increase between 1.5-30x (output of test imerode). The differences are bigger for grayscale images (since imerode was already using convn for binary cases), and larger structuring elements (SE) with multiple dimensions.

A couple of things:

  • in the latest release of the image package (version 2.0.0), there was no erosion for multidimensional binary images (only grayscale);
  • both development versions make use of the new strel class. One of the things it does is decompose structuring elements automatically, which is why the tests use a cross rather than a square for the SE;
  • I’m only testing with the shape “same” since version 2.0.0 only had that one;
  • when using binary images I test with different percentages of true values since the new implementation is sensitive to it;
  • I do not compare the results since I know the new implementations also fix some bugs, especially related to the border pixels;
  • imdilate uses exactly the same code and so I’m assuming that the differences from imerode are the same.
                                 version 2.0.0  cset 6db5e3c6759b  latest development  Performance  Performance
                                 (old)          (dev)              (new)               old/new      dev/new
2D binary image (90%)            0.009081       0.024602           0.006240            1.4551       3.9423
2D binary image (10%)            0.007360       0.022881           0.004160            1.7692       5.5000
3D binary image with 2D SE       NO!            0.481470           0.079125            n/a          6.0849
3D binary image with 3D SE       NO!            0.518032           0.075605            n/a          6.8519
7D binary image with 5D SE       NO!            13.940071          0.463229            n/a          30.093
2D uint8 image                   0.062324       0.043403           0.029322            2.1255       1.4802
3D uint8 image with 2D SE        NO             NO                 0.430347            n/a          n/a
3D uint8 image with 3D SE        3.061951       1.725628           0.791569            3.8682       2.1800
7D uint8 image with 3D SE        NO             NO                 2.005325            n/a          n/a
7D uint8 image with 7D SE        4.091456       2.940984           0.541634            7.5539       5.4298
2D uint8 image with a larger SE  0.610678       0.305579           0.087445            6.9835       3.4945


by carandraug at September 26, 2013 11:23 PM

Andrej Lojdl

The End

The Google Summer of Code 2013 is over, and this is a great opportunity to summarize what is done and working, and what still waits to be implemented in the LaTeX interpreter.

New interpreters skeleton

As part of my project I made a new abstract class named base_text_render. This class provides the core functionality of every interpreter when using the 'fltk' graphics toolkit. There is a text_render class working as a top-level wrapper pointing to one of three interpreters. The user can directly pick the ft_render or latex_render class, and if the library needed for the chosen render is missing, they will be pointed to the dummy_render class, which basically can't do anything; it is there just to prevent the program from breaking. Here is a picture visualizing the class inheritance.

*- Class inheritance picture -*

The user can also use the 'tex' interpreter; this choice will point them to the ft_render class, but instead of plain freetype characters it will use the characters defined within the TeX symbols.

Implemented functionality

At the end of this Summer of Code the core functionality is in place within the LaTeX interpreter. It can be used only for on-screen rendering. Users can use the following text properties:
  • background color - can't be used directly as a text property, but users can use it via the LaTeX command \colorbox{declared-color}{Text}
  • color - can be used within the text properties as the color parameter, and directly in LaTeX via the command \textcolor{declared-color}{Text}
Users can use all the color commands that LaTeX accepts within the package color.
  • fontangle - can't be used as a text property, just directly via the commands \textit{Text} for 'italic' and \textsl{Text} for 'oblique' text
  • fontname - can be used directly in Octave, but when using the LaTeX interpreter the only acceptable fonts are the LaTeX fonts
  • fontsize - can be used directly
  • fontweight - can be used just via LaTeX commands: \textbf{Text} for 'bold' and \textmd{Text} for 'demi'
  • rotation - users can rotate text by 0, 90, 180 and 270 degrees
  • horizontalalignment - acceptable values are 'left', 'right' and 'center', with 'center' as the default alignment
  • verticalalignment - users can use 'middle', 'top', 'baseline', 'cap' and 'bottom', with 'bottom' as the default
NOTE: The default values of the properties are those for the title; for other text objects they can be different.

    All this is tested on Ubuntu 13.04, and it works nicely and quite fast. Maybe a bit slower than when using ft_render, but the measured difference is 10 to 50 milliseconds (closer to the upper number when using Octave with the GUI). I'm quite confident that it will work the same on all Linux-based platforms, with some difference in speed depending on how good the hardware is.

    Things that could/should be done

    It should be tested on Windows (it already has been, and the test fails, so when my private obligations let me loose I will try to fix it), and after that on some Mac system.

    The already working on-screen rendering could be upgraded with some functionality that already exists in the ft_render class, say the lineheight property and better multi-line rendering.

    As part of the project, after the start of coding my mentor suggested implementing this interpreter on the printing side too. But no time was left to do this, so for now it's on hold. Maybe there are other things that can be done, but as the author of this project I don't see them now.

    P.S.: All the files that I touched and that are relevant for this interpreter are txt-render.h/.cc, latex-render.h/.cc, txt-eng-ft.h/.cc and . My online repository is here:

    by AndrejLojdl ( at September 26, 2013 02:20 PM

    September 22, 2013


    Notes to Manage My Repository

    My repository infrastructure is as follows:
    To set up the version control environment, I need to modify the files .hg/hgrc and gnulib-hg/.hg/hgrc under the working directory.

    -------------------- .hg/hgrc --------------------

    public =
    private = ssh://
    default =

    -------------------- gnulib-hg/.hg/hgrc --------------------
    default =

    I usually encounter the two situations below, "Develop and Commit" and "Merge the official development". "Develop and Commit" is used after I modify the source code of Octave; conversely, I use "Merge the official development" to merge in the modifications from others. (lyh is my bookmark.)

    -------------------- Develop and Commit --------------------
    hg update lyh
    ... hack ...
    hg commit -m "development log"
    hg push private

    -------------------- Merge the official development --------------------
    hg update lyh
    hg pull default
    hg merge default
    hg commit -m "Merge the official development"
    hg push private

    P.S: If the bookmark lyh doesn't exist at the remote private repository, we should use hg push -B lyh private instead of hg push private.

    by L Y.H. ( at September 22, 2013 03:26 PM

    September 15, 2013

    Kai Torben Ohlhus

    That's all folks?

    Definitely not! Today is the GSoC "pencils down" day and I want to summarize the state of the project. You can get everything so far from here.

    Goals for the Midterm 2013-07-29 (Monday) :

    • Integration of ITSOL completed (,,,, callable from Octave
    • Wrapper file for MATLAB compatibility (ilu.m)
    • Small test cases for incomplete LU-factorization, showing their correctness
    • Documentation of incomplete LU-factorization types
    • Comprehensive test cases for all types of incomplete LU-factorization types (also proving compatibility with MATLAB)
    • (optional) Complex version for the incomplete-factorizations

    Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

    • Implementations for IC(0) and ICT completed
    • Wrapper file for MATLAB compatibility (ichol.m) finished
    • Small test cases for all incomplete-factorizations
    • Documentation of incomplete LU-factorization types
    • Comprehensive test cases for all types of incomplete-factorization types (also proving compatibility with MATLAB)
    • (optional) Complex version for the incomplete-factorizations

    What is left to do?

    • ILUTP must get to work
    • Make ITSOL a dependency for Octave (there are still some flaws in the official distribution, that make it difficult to use)
    • IC routines (ichol0jp and icholt) still suffer from Fortran indexing and missing complex versions
    • Implement the Modified version for all preconditioners.
    • Everything has to find its way into the official Octave development repository.

    by Kai Torben Ohlhus ( at September 15, 2013 05:08 PM

    September 14, 2013


    The int/uint Type Support

    Octave supports signed and unsigned integers represented by 8, 16, 32, or 64 bits. Unlike some other programming languages, Octave's integer data types use saturation arithmetic rather than modular arithmetic. Saturation arithmetic is a version of arithmetic in which all operations are limited to a fixed range between a minimum and a maximum value. If the result of an operation is greater than the maximum, it is set ("clamped") to the maximum; if it is below the minimum, it is clamped to the minimum.

    Because the arithmetic IRs within LLVM are modular, we translate int/uint operations into external calls to helper functions, which are derived from Branchfree Saturating Arithmetic.
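For illustration, the behaviour of such a helper for uint16 addition can be mimicked in Python (a sketch only; the real helpers are compiled code called from the JIT). The branchfree form computes the modular sum, then ORs in all-ones when the sum wrapped:

```python
def sat_add_u16(a, b):
    # Modular 16-bit sum, as LLVM's plain add would produce.
    s = (a + b) & 0xFFFF
    # Unsigned overflow occurred iff the wrapped sum is below an operand;
    # in that case OR in 0xFFFF to clamp at the maximum, without a branch.
    # (In Python, -(True) == -1, and -1 & 0xFFFF == 0xFFFF.)
    return s | (-(s < a) & 0xFFFF)
```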

    Let's take a look at a simple uint16 example to illustrate the performance improvement:

    -------------------- M-file --------------------
    a = uint16(1);
    b = uint16(1);

    init = uint16(0);
    bounds = uint16(10000);

    i = init;
    do
      j = init;
      do
        a = a + b;
        j++;
      until (j == bounds)
      i++;
    until (i == bounds)

    Here are its JIT compiler linear IR and LLVM IR:

    -------------------- Compiling tree --------------------
    i = init;
    do
      j = init;
      do
        a = a + b;
        j++;
      until(j == bounds)
      i++;
    until(i == bounds)

    -------------------- octave jit ir --------------------
    body0:        %pred =
            uint16: i#30 = extract i
            uint16: bounds#23 = extract bounds
            any: ans#19 = extract ans
            uint16: b#8 = extract b
            uint16: a#6 = extract a
            any: j#3 = extract j
            uint16: init#1 = extract init
            branch: [live] do_until_body2
    do_until_body2:        %pred = body0, phi_split1
            uint16: init#72 phi | body0 -> uint16: init#1
                                | phi_split1 -> uint16: init#70
            uint16: i#68 phi | body0 -> uint16: i#30
                             | phi_split1 -> uint16: i#36
            uint16: bounds#65 phi | body0 -> uint16: bounds#23
                                  | phi_split1 -> uint16: bounds#66
            uint16: b#62 phi | body0 -> uint16: b#8
                             | phi_split1 -> uint16: b#63
            any: ans#59 phi | body0 -> any: ans#19
                            | phi_split1 -> any: ans#87
            uint16: a#56 phi | body0 -> uint16: a#6
                             | phi_split1 -> uint16: a#12
            uint16: #2 = call  (uint16: init#72)
            uint16: j#4 = uint16: #2
            branch: [live] do_until_body3
    do_until_body3:        %pred = do_until_body2, do_until_cond_check9
            uint16: j#73 phi | do_until_body2 -> uint16: j#4
                             | do_until_cond_check9 -> uint16: j#18
            uint16: init#70 phi | do_until_body2 -> uint16: init#72
                                | do_until_cond_check9 -> uint16: init#70
            uint16: i#69 phi | do_until_body2 -> uint16: i#68
                             | do_until_cond_check9 -> uint16: i#69
            uint16: bounds#66 phi | do_until_body2 -> uint16: bounds#65
                                  | do_until_cond_check9 -> uint16: bounds#66
            uint16: b#63 phi | do_until_body2 -> uint16: b#62
                             | do_until_cond_check9 -> uint16: b#63
            uint16: a#57 phi | do_until_body2 -> uint16: a#56
                             | do_until_cond_check9 -> uint16: a#12
            uint16: #7 = call  (uint16: a#57)
            uint16: #9 = call  (uint16: b#63)
            uint16: #10 = call binary+ (uint16: #7, uint16: #9)
            branch: [live] do_until_body4
    do_until_body4:        %pred = do_until_body3
            uint16: a#12 = uint16: #10
            uint16: #13 = call  (uint16: j#73)
            uint16: #14 = call unary++ (uint16: #13)
            branch: [live] do_until_body5
    do_until_body5:        %pred = do_until_body4
            uint16: #16 = call  (uint16: #13)
            branch: [live] do_until_body6
    do_until_body6:        %pred = do_until_body5
            uint16: j#18 = uint16: #14
            uint16: ans#20 = uint16: #16
            branch: [live] do_until_cond_check7
    do_until_cond_check7:        %pred = do_until_body6
            uint16: #22 = call  (uint16: j#18)
            uint16: #24 = call  (uint16: bounds#66)
            bool: #25 = call binary== (uint16: #22, uint16: #24)
            branch: [live] do_until_cond_check8
    do_until_cond_check8:        %pred = do_until_cond_check7
            bool: #27 = call logically_true (bool: #25)
            branch: [live] do_until_cond_check9
    do_until_cond_check9:        %pred = do_until_cond_check8
            cond_branch: bool: #27, [live] do_until_tail10, [live] do_until_body3
    do_until_tail10:        %pred = do_until_cond_check9
            uint16: #31 = call  (uint16: i#69)
            uint16: #32 = call unary++ (uint16: #31)
            branch: [live] do_until_tail11
    do_until_tail11:        %pred = do_until_tail10
            uint16: #34 = call  (uint16: #31)
            branch: [live] do_until_tail12
    do_until_tail12:        %pred = do_until_tail11
            uint16: i#36 = uint16: #32
            uint16: ans#37 = uint16: #34
            branch: [live] do_until_cond_check13
    do_until_cond_check13:        %pred = do_until_tail12
            uint16: #39 = call  (uint16: i#36)
            uint16: #40 = call  (uint16: bounds#66)
            bool: #41 = call binary== (uint16: #39, uint16: #40)
            branch: [live] do_until_cond_check14
    do_until_cond_check14:        %pred = do_until_cond_check13
            bool: #43 = call logically_true (bool: #41)
            branch: [live] do_until_cond_check15
    do_until_cond_check15:        %pred = do_until_cond_check14
            cond_branch: bool: #43, [live] do_until_tail16, [live] phi_split1
    phi_split1:        %pred = do_until_cond_check15
            any: #86 = call (any) (uint16: ans#37)
            any: ans#87 = any: #86
            branch: [live] do_until_body2
    do_until_tail16:        %pred = do_until_cond_check15
            branch: [live] final17
    final17:        %pred = do_until_tail16
            uint16: j#74 phi | do_until_tail16 -> uint16: j#18
            uint16: i#67 phi | do_until_tail16 -> uint16: i#36
            uint16: ans#58 phi | do_until_tail16 -> uint16: ans#37
            store a = uint16: a#12
            store ans = uint16: ans#58
            store b = uint16: b#63
            store bounds = uint16: bounds#66
            store i = uint16: i#67
            store init = uint16: init#70
            store j = uint16: j#74

    -------------------- llvm ir --------------------
    define void @foobar(%octave_base_value**) {
      %1 = getelementptr inbounds %octave_base_value** %0, i32 0
      %2 = getelementptr inbounds %octave_base_value** %0, i32 1
      %3 = getelementptr inbounds %octave_base_value** %0, i32 2
      %4 = getelementptr inbounds %octave_base_value** %0, i32 3
      %5 = getelementptr inbounds %octave_base_value** %0, i32 4
      %6 = getelementptr inbounds %octave_base_value** %0, i32 5
      %7 = getelementptr inbounds %octave_base_value** %0, i32 6
      br label %body

    body:                                             ; preds = %prelude
      %8 = load %octave_base_value** %1
      %9 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %8)
      %10 = load %octave_base_value** %2
      %11 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %10)
      %12 = load %octave_base_value** %3
      %13 = call %octave_base_value* @id_any(%octave_base_value* %12)
      %14 = load %octave_base_value** %4
      %15 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %14)
      %16 = load %octave_base_value** %5
      %17 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %16)
      %18 = load %octave_base_value** %6
      %19 = call %octave_base_value* @id_any(%octave_base_value* %18)
      %20 = load %octave_base_value** %7
      %21 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %20)
      br label %do_until_body

    do_until_body:                                    ; preds = %phi_split, %body
      %22 = phi i16 [ %21, %body ], [ %30, %phi_split ]
      %23 = phi i16 [ %9, %body ], [ %46, %phi_split ]
      %24 = phi i16 [ %11, %body ], [ %32, %phi_split ]
      %25 = phi i16 [ %15, %body ], [ %33, %phi_split ]
      %26 = phi %octave_base_value* [ %13, %body ], [ %52, %phi_split ]
      %27 = phi i16 [ %17, %body ], [ %37, %phi_split ]
      %28 = call i16 @id_uint16(i16 %22)
      br label %do_until_body1

    do_until_body1:                                   ; preds = %do_until_cond_check6, %do_until_body
      %29 = phi i16 [ %28, %do_until_body ], [ %39, %do_until_cond_check6 ]
      %30 = phi i16 [ %22, %do_until_body ], [ %30, %do_until_cond_check6 ]
      %31 = phi i16 [ %23, %do_until_body ], [ %31, %do_until_cond_check6 ]
      %32 = phi i16 [ %24, %do_until_body ], [ %32, %do_until_cond_check6 ]
      %33 = phi i16 [ %25, %do_until_body ], [ %33, %do_until_cond_check6 ]
      %34 = phi i16 [ %27, %do_until_body ], [ %37, %do_until_cond_check6 ]
      %35 = call i16 @id_uint16(i16 %34)
      %36 = call i16 @id_uint16(i16 %33)
      %37 = call i16 @octave_jit_add_uint16_uint16(i16 %35, i16 %36)
      br label %do_until_body2

    do_until_body2:                                   ; preds = %do_until_body1
      %38 = call i16 @id_uint16(i16 %29)
      %39 = call i16 @octave_jit_incr_uint16(i16 %38)
      br label %do_until_body3

    do_until_body3:                                   ; preds = %do_until_body2
      %40 = call i16 @id_uint16(i16 %38)
      br label %do_until_body4

    do_until_body4:                                   ; preds = %do_until_body3
      br label %do_until_cond_check

    do_until_cond_check:                              ; preds = %do_until_body4
      %41 = call i16 @id_uint16(i16 %39)
      %42 = call i16 @id_uint16(i16 %32)
      %43 = call i1 @"octave_jit==_uint16"(i16 %41, i16 %42)
      br label %do_until_cond_check5

    do_until_cond_check5:                             ; preds = %do_until_cond_check
      %44 = call i1 @id_bool(i1 %43)
      br label %do_until_cond_check6

    do_until_cond_check6:                             ; preds = %do_until_cond_check5
      br i1 %44, label %do_until_tail, label %do_until_body1

    do_until_tail:                                    ; preds = %do_until_cond_check6
      %45 = call i16 @id_uint16(i16 %31)
      %46 = call i16 @octave_jit_incr_uint16(i16 %45)
      br label %do_until_tail7

    do_until_tail7:                                   ; preds = %do_until_tail
      %47 = call i16 @id_uint16(i16 %45)
      br label %do_until_tail8

    do_until_tail8:                                   ; preds = %do_until_tail7
      br label %do_until_cond_check9

    do_until_cond_check9:                             ; preds = %do_until_tail8
      %48 = call i16 @id_uint16(i16 %46)
      %49 = call i16 @id_uint16(i16 %32)
      %50 = call i1 @"octave_jit==_uint16"(i16 %48, i16 %49)
      br label %do_until_cond_check10

    do_until_cond_check10:                            ; preds = %do_until_cond_check9
      %51 = call i1 @id_bool(i1 %50)
      br label %do_until_cond_check11

    do_until_cond_check11:                            ; preds = %do_until_cond_check10
      br i1 %51, label %do_until_tail12, label %phi_split

    phi_split:                                        ; preds = %do_until_cond_check11
      %52 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %47)
      br label %do_until_body

    do_until_tail12:                                  ; preds = %do_until_cond_check11
      br label %final

    final:                                            ; preds = %do_until_tail12
      %53 = phi i16 [ %39, %do_until_tail12 ]
      %54 = phi i16 [ %46, %do_until_tail12 ]
      %55 = phi i16 [ %47, %do_until_tail12 ]
      %56 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %37)
      store %octave_base_value* %56, %octave_base_value** %5
      %57 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %55)
      store %octave_base_value* %57, %octave_base_value** %3
      %58 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %33)
      store %octave_base_value* %58, %octave_base_value** %4
      %59 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %32)
      store %octave_base_value* %59, %octave_base_value** %2
      %60 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %54)
      store %octave_base_value* %60, %octave_base_value** %1
      %61 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %30)
      store %octave_base_value* %61, %octave_base_value** %7
      %62 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %53)
      store %octave_base_value* %62, %octave_base_value** %6
      ret void

    -------------------- optimized llvm ir --------------------
    define void @foobar(%octave_base_value**) {
      %1 = getelementptr inbounds %octave_base_value** %0, i64 1
      %2 = getelementptr inbounds %octave_base_value** %0, i64 2
      %3 = getelementptr inbounds %octave_base_value** %0, i64 3
      %4 = getelementptr inbounds %octave_base_value** %0, i64 4
      %5 = getelementptr inbounds %octave_base_value** %0, i64 5
      %6 = getelementptr inbounds %octave_base_value** %0, i64 6
      %7 = load %octave_base_value** %0, align 8
      %8 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %7)
      %9 = load %octave_base_value** %1, align 8
      %10 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %9)
      %11 = load %octave_base_value** %3, align 8
      %12 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %11)
      %13 = load %octave_base_value** %4, align 8
      %14 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %13)
      %15 = load %octave_base_value** %6, align 8
      %16 = call i16 @octave_jit_cast_uint16_any(%octave_base_value* %15)
      br label %do_until_body1

    do_until_body1:                                   ; preds = %prelude, %phi_split, %do_until_body1
      %17 = phi i16 [ %21, %do_until_body1 ], [ %16, %prelude ], [ %16, %phi_split ]
      %18 = phi i16 [ %18, %do_until_body1 ], [ %8, %prelude ], [ %23, %phi_split ]
      %19 = phi i16 [ %20, %do_until_body1 ], [ %14, %prelude ], [ %20, %phi_split ]
      %20 = call i16 @octave_jit_add_uint16_uint16(i16 %19, i16 %12)
      %21 = call i16 @octave_jit_incr_uint16(i16 %17)
      %22 = icmp eq i16 %21, %10
      br i1 %22, label %do_until_tail, label %do_until_body1

    do_until_tail:                                    ; preds = %do_until_body1
      %23 = call i16 @octave_jit_incr_uint16(i16 %18)
      %24 = icmp eq i16 %23, %10
      br i1 %24, label %final, label %phi_split

    phi_split:                                        ; preds = %do_until_tail
      %25 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %18)
      br label %do_until_body1

    final:                                            ; preds = %do_until_tail
      %26 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %20)
      store %octave_base_value* %26, %octave_base_value** %4, align 8
      %27 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %18)
      store %octave_base_value* %27, %octave_base_value** %2, align 8
      %28 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %12)
      store %octave_base_value* %28, %octave_base_value** %3, align 8
      %29 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %10)
      store %octave_base_value* %29, %octave_base_value** %1, align 8
      %30 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %10)
      store %octave_base_value* %30, %octave_base_value** %0, align 8
      %31 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %16)
      store %octave_base_value* %31, %octave_base_value** %6, align 8
      %32 = call %octave_base_value* @octave_jit_cast_any_uint16(i16 %10)
      store %octave_base_value* %32, %octave_base_value** %5, align 8
      ret void

    Execution time without JIT compilation enabled is 7m32.402s. In contrast, with JIT enabled the execution time is reduced to 0m0.667s.

    by L Y.H. at September 14, 2013 07:22 AM

    September 11, 2013

    Anirudha Bose


    The Just-In-Time (JIT) compiler using the LLVM backend is an experimental feature in Octave, which will make Octave capable of compiling during run-time. It can significantly speed up loops and even other parts of the program as well. The development of JIT for Octave is beyond the scope of my project, as it is a separate GSoC project of another student. Here is how I added JIT support to the Octave build.

    Natively compiling LLVM using MXE and then using it to compile Octave is the standard way to get JIT support. This process has been followed to compile Octave for MinGW. However, I found many discrepancies when following this process on Mac OS X. As I described in the previous post on this blog, I had faced problems with dynamic library path names. There is a similar problem with the llvm-config utility, which turned out to be a self-referential symbolic link.
    This is the result of ls -l ./usr/bin/llvm-config
    lrwxr-xr-x  1 root  staff  48 Sep 11 02:29 ./usr/bin/llvm-config -> /Users/mac6-user1/mxe-octave/usr/bin/llvm-config

    On a Mac OS X system, there is no special need to compile LLVM separately, because it is already present as the default compiler. The llvm-config utility happens to be installed at /opt/local/libexec/llvm-3.3/bin/llvm-config. I added this path to the environment variable LLVM_CONFIG, explicitly passed the --enable-jit option to configure, and Octave successfully compiled with JIT. Here are the configure details.

    LLVM CPPFLAGS:               -isystem /opt/local/libexec/llvm-3.3/include
    LLVM LDFLAGS:                -L/opt/local/libexec/llvm-3.3/lib
    LLVM libraries:              -lLLVMInstrumentation -lLLVMArchive -lLLVMLinker -lLLVMIRReader -lLLVMBitReader -lLLVMAsmParser -lLLVMDebugInfo -lLLVMOption -lLLVMipo -lLLVMVectorize -lLLVMBitWriter -lLLVMTableGen -lLLVMSystemZCodeGen -lLLVMSystemZAsmParser -lLLVMSystemZDesc -lLLVMSystemZInfo -lLLVMSystemZAsmPrinter -lLLVMHexagonCodeGen -lLLVMHexagonAsmPrinter -lLLVMHexagonDesc -lLLVMHexagonInfo -lLLVMNVPTXCodeGen -lLLVMNVPTXDesc -lLLVMNVPTXInfo -lLLVMNVPTXAsmPrinter -lLLVMMBlazeDisassembler -lLLVMMBlazeCodeGen -lLLVMMBlazeDesc -lLLVMMBlazeAsmPrinter -lLLVMMBlazeAsmParser -lLLVMMBlazeInfo -lLLVMCppBackendCodeGen -lLLVMCppBackendInfo -lLLVMMSP430CodeGen -lLLVMMSP430Desc -lLLVMMSP430Info -lLLVMMSP430AsmPrinter -lLLVMXCoreDisassembler -lLLVMXCoreCodeGen -lLLVMXCoreDesc -lLLVMXCoreInfo -lLLVMXCoreAsmPrinter -lLLVMMipsDisassembler -lLLVMMipsCodeGen -lLLVMMipsAsmParser -lLLVMMipsDesc -lLLVMMipsInfo -lLLVMMipsAsmPrinter -lLLVMARMDisassembler -lLLVMARMCodeGen -lLLVMARMAsmParser -lLLVMARMDesc  -lLLVMARMInfo -lLLVMARMAsmPrinter -lLLVMAArch64Disassembler -lLLVMAArch64CodeGen -lLLVMAArch64AsmParser -lLLVMAArch64Desc -lLLVMAArch64Info -lLLVMAArch64AsmPrinter -lLLVMAArch64Utils -lLLVMPowerPCCodeGen -lLLVMPowerPCDesc -lLLVMPowerPCAsmPrinter -lLLVMPowerPCAsmParser -lLLVMPowerPCInfo -lLLVMSparcCodeGen -lLLVMSparcDesc -lLLVMSparcInfo -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMX86Desc -lLLVMX86Info -lLLVMX86AsmPrinter -lLLVMX86Utils -lLLVMMCDisassembler -lLLVMMCParser -lLLVMInterpreter -lLLVMMCJIT -lLLVMJIT -lLLVMCodeGen -lLLVMObjCARCOpts -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMRuntimeDyld -lLLVMExecutionEngine -lLLVMTarget -lLLVMMC -lLLVMObject -lLLVMCore -lLLVMSupport

    JIT compiler for loops:             yes
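    For reference, the build steps described above boil down to something like the following sketch; the llvm-config path is the MacPorts LLVM 3.3 location mentioned in the post, and the exact configure invocation may differ between Octave versions.

    ```shell
    # Sketch of the build steps described in this post (not a verified recipe):
    # point the build at the system LLVM, then ask configure for JIT support.
    export LLVM_CONFIG=/opt/local/libexec/llvm-3.3/bin/llvm-config
    ./configure --enable-jit
    ```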

    The final evaluation is close, and since I already have a working build of Octave for Mac OS X, I will now push all my code to my repo mxe-octave-anirudha.

    by Anirudha at September 11, 2013 04:24 PM

    September 09, 2013

    Kai Torben Ohlhus

    One week left

    The final week is coming closer. Here is the current state of my project:

    Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date") :

    • Implementations for IC(0) and ICT completed
    • Wrapper file for MATLAB compatibility (ichol.m) finished
    • Small test cases for all incomplete-factorizations (ICT missing)
    • Documentation of incomplete LU-factorization types (ICT missing)
    • Comprehensive test cases for all incomplete-factorization types (also proving compatibility with MATLAB) (ICT missing)
    • (optional) Complex version for the incomplete-factorizations (ICT missing)
    This week, ICT and the remaining documentation and test cases are on my schedule. Last week I implemented a Modified Incomplete Factorization (MIC) option for IC(0), which unfortunately is not 100% compatible with MATLAB. As far as I understand MIC (see, for example, Saad, chapter 10.3.5), after a completed step you add the dropped elements (here marked red) to the main diagonal. MATLAB seems to handle this differently. If anyone knows more, let me know.

    A = [ 0.37, -0.05,  -0.05,  -0.07;
         -0.05,  0.116,  0.0,   -0.05;
         -0.05,  0.0,    0.116, -0.05;
         -0.07, -0.05,  -0.05,   0.202];
    Normal Cholesky Factorization
    L1 = chol (A, 'lower');
    ans =
       0.608276253029822                   0                   0                   0
      -0.082199493652679   0.330519656364403                   0                   0
      -0.082199493652679  -0.020442828820163   0.329886850288205                   0
      -0.115079291113750  -0.179896893617439  -0.191390050272693   0.346068942669187
    Incomplete Cholesky Factorization no MIC
    opts.type = 'nofill';
    opts.michol = 'off';
    L2 = ichol (sparse (A), opts);
    ans =
       0.608276253029822                   0                   0                   0
      -0.082199493652679   0.330519656364403                   0                   0
      -0.082199493652679                   0   0.330519656364403                   0
      -0.115079291113750  -0.179896893617439  -0.179896893617439   0.352180311900522
    Incomplete Cholesky Factorization with MIC
    opts.michol = 'on';
    L3 = ichol (sparse (A), opts);
    ans =
       0.608276253029822                   0                   0                   0
      -0.082199493652679   0.320135106613577                   0                   0
      -0.082199493652679                   0   0.320135106613577                   0
      -0.115079291113750  -0.185732393077497  -0.185732393077497   0.346068942669187

    Computing the dropped element itself is no problem.
    drop = (A(3,2) - L2(2,1) * L2(3,1)) / L2 (2,2)
    drop =
    Expected compensated diagonal value
    D_2_2 = L2(2,2) + drop % * x
    D_2_2 =
    Some factor by which the results differ from each other
    x = full(L3 (2,2) - L2(2,2)) / drop
    x =
    But the main diagonal D_2_2 must be computed first. The off-diagonal entries can then be computed with it.
    L_4_2 = (A(4,2) - L3(2,1) * L3(4,1)) / L3 (2,2)
    L_4_2 =

    The complex input handling is compatible with MATLAB. An example:

    A = [ 0.37, -0.05,         -0.05,  -0.07;
         -0.05,  0.116,         0.0,   -0.05 + 0.05i;
         -0.05,  0.0,           0.116, -0.05;
         -0.07, -0.05 - 0.05i, -0.05,   0.202];
    opts.type = 'nofill';
    opts.michol = 'off';
    L = ichol (sparse (A), opts);
    norm (A - L * ctranspose (L), 'fro') / norm (A, 'fro')
    ans =  0.0195288512358234
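    The "nofill" IC(0) used in these examples can be sketched in a few lines of Python. This is only an illustration of the algorithm on a dense matrix, not Octave's actual ichol implementation; the dropped entry at position (3,2) below is exactly what the MIC option would add back onto the diagonal.

    ```python
    import math

    def ic0(A):
        """Dense sketch of "nofill" incomplete Cholesky: an entry L(i,j) is
        computed only where A(i,j) is nonzero; everything else is dropped."""
        n = len(A)
        L = [[0.0] * n for _ in range(n)]
        for j in range(n):
            # Diagonal entry, using the already-computed part of row j.
            L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
            for i in range(j + 1, n):
                if A[i][j] == 0:
                    continue  # nofill: drop entries outside the sparsity pattern of A
                L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
        return L

    # The example matrix from this post.
    A = [[ 0.37, -0.05,  -0.05,  -0.07 ],
         [-0.05,  0.116,  0.0,   -0.05 ],
         [-0.05,  0.0,    0.116, -0.05 ],
         [-0.07, -0.05,  -0.05,   0.202]]
    L = ic0(A)
    # L matches the L2 ("no MIC") factor shown above,
    # e.g. L[3][3] is approximately 0.352180311900522.
    ```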

    by Kai Torben Ohlhus at September 09, 2013 05:35 PM

    September 08, 2013

    Marco Vassallo

    update # 15

    This week I've started writing the documentation, and I have thus created a new page on the Octave wiki about the fem-fenics pkg. You can access it at the following link

    The pkg is also almost ready for its first release, so that I can start to get some feedback. At the moment, you can however download it from the mercurial repository:

    by Gedeone at September 08, 2013 11:09 AM

    September 07, 2013

    Ahsan Ali Shahid

    Agora Update #9

    Last week my focus was on improving and fixing bugs in the bundle system. Currently there is a problem where Celery does not process newly uploaded bundles. Celery is a task queue system that allows us to run tasks asynchronously; without some task queue mechanism, it is not possible for the server to process uploaded bundles concurrently.

    So I investigated the problem thoroughly. I worked directly on the production server to find out where the main problem was, and discovered that Celery was closing unexpectedly for some unknown reason. I told my mentor Jordi about the problem, and he suggested running Celery on only one core by setting the concurrency to one. I tried that, but the problem remained. I also found out that the hosting the site currently runs on kills Celery due to high resource usage.

    This should not happen, because Celery uses RabbitMQ by default, and RabbitMQ is a robust message broker. I also studied RabbitMQ extensively and found in the FAQ that a single configuration change can solve the current problem with running the bundle system. The good thing is that if a task fails before completion, Celery won't mark it as “done”. This behaviour can be turned on by setting CELERY_ACKS_LATE to true in Celery's config file.
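    As a sketch, the concurrency and acknowledgement settings discussed above would look something like this in a Celery config module. CELERY_ACKS_LATE is the setting named in this post; CELERYD_CONCURRENCY is my assumption for the "concurrency of one" knob (both are Celery 3.x-era names), and the actual deployment layout may differ.

    ```python
    # Sketch of a celeryconfig module with the settings discussed above.
    # CELERYD_CONCURRENCY (assumed name) limits the worker to a single
    # process, as suggested by Jordi.
    CELERYD_CONCURRENCY = 1

    # CELERY_ACKS_LATE makes Celery acknowledge a task only after it
    # finishes, so a task killed mid-run is redelivered instead of
    # being marked "done".
    CELERY_ACKS_LATE = True
    ```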

    So my finding is that the current hosting has some unknown problems, and it is almost impossible for the Agora website to run efficiently on it.

    I have therefore proposed to my mentor that we opt for a good new web hosting provider, because the bundle system itself does not have any problem; it is the current hosting that hinders its proper functioning.

    I am also looking to integrate a mechanism for automatic backups.

    That's all for now.

    by ahsanali at September 07, 2013 11:27 AM

    September 04, 2013

    Anirudha Bose


    It took me some time to figure out a way to enable functionality for sparse matrices in Octave. Initially, the configure script was unable to find the location where the sparse matrix libraries are present. I had to specify the paths in src/ like this:

    --with-umfpack-includedir='$(HOST_INCDIR)' \
    --with-umfpack-libdir='$(HOST_LIBDIR)' \
    --with-cxsparse-includedir='$(HOST_INCDIR)' \
    --with-cxsparse-libdir='$(HOST_LIBDIR)' \

    Manually specifying the paths worked, and Octave was built with the appropriate CPPFLAGS and LDFLAGS. The configure details of the libraries which add the functionality of sparse matrices in Octave are given below:

    AMD CPPFLAGS: -I/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/include
    AMD LDFLAGS: -L/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib
    AMD libraries: -lamd

    CAMD CPPFLAGS: -I/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/include
    CAMD LDFLAGS: -L/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib
    CAMD libraries: -lcamd

    CCOLAMD CPPFLAGS: -I/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/include
    CCOLAMD LDFLAGS: -L/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib
    CCOLAMD libraries: -lccolamd

    COLAMD CPPFLAGS: -I/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/include
    COLAMD LDFLAGS: -L/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib
    COLAMD libraries: -lcolamd

    CXSPARSE CPPFLAGS: -I/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/include
    CXSPARSE LDFLAGS: -L/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib
    CXSPARSE libraries: -lcxsparse

    UMFPACK CPPFLAGS: -I/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/include
    UMFPACK LDFLAGS: -L/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib
    UMFPACK libraries: -lumfpack

    CHOLMOD CPPFLAGS: -I/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/include
    CHOLMOD LDFLAGS: -L/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib
    CHOLMOD libraries: -lcholmod

    Even if Octave is built this way, you are likely to see some errors like the one below:

    dyld: Library not loaded:
    Referenced from: /Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/bin/octave
    Reason: image not found
    Trace/BPT trap: 5

    The hint to solve this came from Alexander Hansen. He suggested that /Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/bin/octave wants to link “”, which would be /Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib/ . When I ran otool -L /Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/bin/octave, I saw that the sparse matrix libraries like and friends had just their bare names in the list of linked libraries, where there should have been actual absolute paths. “otool” is a command that displays specified parts of object files or libraries.

    One solution which seemed valid to me was to change the install names of these dynamic shared libraries. Fortunately, Mac OS X already has a tool called “install_name_tool” for this job. The dynamic library can be made visible to Octave with the following command:

    install_name_tool -change "" "/Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/lib/" /Users/mac6-user1/mxe/mxe-octave-anirudha/usr/x86_64-apple-darwin12.2.1/bin/octave

    I also had to recompile Octave with a new LDFLAG in order to reserve enough space for changing all dynamic shared library paths. I added this in the src/ file:

    There are a few other libraries with this error; they can be fixed by following the same procedure. Finally, when all the install names in the faulty libraries/object files are fixed, Octave is ready to be launched.

    by Anirudha at September 04, 2013 07:59 PM

    Carnë Draug

    GSoC 2013: done with ImageIO – please test (update #5)

    The imageIO section of the original project was much shorter than this. Originally it was limited to implementing imformats, and then expanding it to read and write multipage images in
    a Matlab-compatible way, which is just implementing the options Index and Frames in imread, and WriteMode in imwrite. The date of that last commit was the 16th of July. However, I didn’t stop there, and tried to fix other incompatibilities with Matlab and add new options.

    Here’s a list of the main changes:

    • dealing with transparency: the alpha channel is now returned as the third output of imread and is only accepted with the Alpha option of imwrite. Previously, the alpha channel would be considered the 2nd or 4th element on the 3rd dimension for grayscale and RGB images respectively. This prevented the distinction between an RGB image with transparency and a CMYK image. Also, since GraphicsMagick has the transparency values inverted from most other applications (Matlab included), we were doing the same. This has been fixed.
    • CMYK images: implemented reading and writing of images in the CMYK colorspace.
    • imread options: in addition to Index and Frames, also implemented PixelRegion, which may speedup reading when only parts of the image are of interest, and Info, for compatibility only, which doesn’t have any effect at the moment.
    • imwrite options: in addition to WriteMode, implemented Alpha (as mentioned above).
    • indexed images: implemented writing of indexed images. Previously, indexed images would be converted to RGB, which would then be saved. An indexed image will now always return indexes, independently of whether a colormap was requested as output (previously, a raster image would be returned in such cases).
    • floating point and int32 images: GraphicsMagick is integer-only, and we can’t differentiate between an image saved with int32 or floating point. Since the latter are far more common, Octave will now return a floating point image when GraphicsMagick reports a bitdepth of 32. Previously, images with a reported bitdepth of more than 16 would not be read at all (note that this depends on the GM build options: if GM’s quantum depth was 16 or 8, the most common, GM would report a file with 32 bitdepth with the quantum depth it was built with, so it would read files with a bitdepth of 32 as long as it was not built with quantum depth 32).
    • bitdepth: imread now returns the actual bitdepth of the image in the file rather than trying to optimize it. This is still not perfect but we’re making a better guess at it than before (this should be very simple but with GraphicsMagick it’s not).
    • imfinfo: some new fields, especially reading of Exif and GPS data, which allowed us to deprecate readexif in the image package.

    In the end, I recognize that there are some incompatibilities left to fix, but I no longer know how. As I mentioned in my previous post, GraphicsMagick is too smart for our use. I predict that we will eventually have to move to something different, if for nothing else, to read and write floating point images without trouble. I only had a quick glance at alternatives, but FreeImage would seem like a good bet. Of course, there’s also the possibility of writing our own library wrapper around the format-specific libraries.

    It would be nice if users could throw some images at imread and imwrite and report any bugs. I’m especially afraid of regressions, since so much was changed and we don’t really have tests for these functions yet.

    by carandraug at September 04, 2013 05:56 AM

    September 02, 2013

    Wendy Liu

    How not to check the validity of an email address

    Another update, after this post somehow hit #1 on Hacker News and garnered my website more visitors in one hour than it usually gets in a year: I got a lot of feedback on this post. While a lot of it was positive, there was also some valid criticism about the amount of flaming in this post. While at first my response was along the lines of "h8rs gunna h8 lol", after some thought I realised that while posts like this can be fun to write, they can also be counterproductive and even harmful to the developer community. I know that if someone had found some of my code and decided to publicly rip it apart, I would almost certainly feel hurt — even if it were deserved (like that time I wrote a custom CMS and made it vulnerable to SQL injection).

    I was young and foolish when I wrote this post. Admittedly, I'm less than a week older now, but hopefully much less foolish. Although I don't regret writing this post, I don't think I will be making posts of a similar nature in the future. While I do believe that bad practices in the developer community should be called out, there are undoubtedly ways of getting the same message across without resorting to vitriol or an overly snarky tone — even if the end result runs the risk of being less entertaining. Posts like this have a tendency to exacerbate a culture that is already mired in toxicity, and while recognising this fact cannot completely absolve me, I hope that it will at least deter others from behaving similarly.

    I considered taking this post down, but that seemed silly considering that 99% of people who will ever read this post have, in all likelihood, already read it. In addition, I am no longer afraid of being sued for defamation by the vendor, since they seem to have taken it in stride and — amazingly enough — have even expressed an interest in interviewing me for a developer position as a result of this blog post. If the original programmer ever reads this post, I apologise for the public ridicule at your expense. All I can say is that I hope you no longer write code like this. Though if you do, please send it to me and I will do my utmost to write a sincere, snark-free analysis of it. <3

    As a sidenote, one thing I found rather surprising is the fact that a significant number of readers seemed to interpret this post as an ode to my hatred of drugs. This would have been ideal if I ever intended to run for office, but since I don't, it's just perplexing; I would have thought the tone of the post made it obvious enough, but since it clearly wasn't, I will clarify that I merely took an over-the-top approach and ran with it, in the interest of writing an article that would be more interesting to read than "Here's some code I found". It is a fairly lazy literary technique, though, and probably not one I will be using again anytime soon. (Hey, I never claimed this post was a literary masterpiece. I promise it will get better, though.)

    Pretend, for a moment, that you're a newly hired programmer working on a popular learning management system called Hot4Learning. Your predecessor had been working on adding email functionality to the system, so that any user would be able to send an email to any other user at the school via a web interface. Unfortunately, your predecessor was recently hit by a bus and never managed to complete his magnum opus. Your task is to finish it by adding an email validation feature, to ensure that an email is sent only if the recipient is a valid email address associated with the school.

    For example: if Bob is a student at McGill University, then he should be able to send an email to any valid or email address. So if his friend Jane's school email is, then Bob should be able to send an email to that address. On the other hand, he should not be able to send an email to Nor should he be able to send an email to

    So your job is to write this feature, assuming you have access to a list of valid emails for the school.

    I hope you're thinking, "Please, that's easy. First, I'll create a string by joining the emails in the list with the string '|****|', then I'll pass that to the client by assigning the string to a variable within a <script> tag. Next, in the Javascript, I'll instantiate an Array object and assign it to a variable endearingly named temp, then throw away the aforementioned Array object and set temp to be the array created by splitting the string on '|****|'. Whenever a user enters an email, I'll convert it to lowercase and store it in a variable with the beguiling name curForwardUserName, and I'll create a variable called validUserName which I will set to be the string '0', because why use a boolean when you can use a string? Then, I'll loop through the entirety of temp; for each email address in temp, I'll convert that to lowercase, then check if it's equal to curForwardUserName. If so, I'll set validUserName to the string '1', and break. Finally, if validUserName equals the string '1', I'll consider curForwardUserName to be a valid email address; otherwise, it's invalid.

    Actually, no. That's a lie. I really hope that's not what you're thinking. I hope what you actually thought was along the lines of "Well, obviously I'd do the validation on the server side. I can store the valid emails in a data structure that allows O(1) membership testing, then check if the input email is contained in there." Because, well, that's the right way to do it.

    This is your brain

    In Python, the right way to do it would look something like this:

    if input_email in valid_emails_set:
        send_email(input_email, another_param, etc)
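    A self-contained sketch of that server-side check follows; the addresses, domain, and function name here are made up for illustration, but the point is the same: a set gives an O(1) average-time membership test, and the case normalization happens once per lookup rather than in a client-side loop.

    ```python
    # Server-side email validation sketch: O(1) set membership instead of
    # shipping the whole address list to the browser.
    # All names and addresses below are hypothetical.
    valid_emails = {"", ""}

    def can_send(address):
        # Normalize case once, then do a constant-time set lookup.
        return address.lower() in valid_emails

    print(can_send("Jane.Doe@example.EDU"))  # True
    print(can_send(""))      # False
    ```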

    This is your brain on drugs

    On the other hand, if you're working on a product called Hot4Learning, you probably hate your job and need to consume a staggering amount of drugs in order to keep coming to work. This is what your drug-addled brain might come up with:

    Warning: NSFWUYWOH (Not Safe For Work Unless You Work On Hot4Learning)

    // [some other code here]
    var userNamesStr = '|****| |****|
    (pretend there are 70,000 more emails here)|****|';
    var temp = new Array();
    temp = userNamesStr.split('|****|');
    var validUserName = '0';
    // [more code here]
    for( i =0; i< temp.length; i++){
    if( curForwardUserName == temp[i].toLowerCase()) {
    validUserName = '1';
    // [some last bits of wisdom before we go]

    UPDATE: Yes, the above code does involve sending all 70,000+ McGill email addresses to the client upon each page load. The source code of the page was over 2.5MB in size. As in, 2.5MB of pure text.

    Since your reviewer would likely be as stoned as you, you would get a quick LGTM and the above code would soon find its way into production.

    The actual story

    Last year, the IT department at McGill University replaced the old LMS (Blackboard) with a new system (which is not actually called Hot4Learning, but the actual moniker is no less trite or gimmicky so I'll spare you). Since the old system was riddled with vulnerabilities and suffered from general all-around shittiness, I was excited to see what the new system would be like. That excitement soon died, just like brain cells do when exposed to copious amounts of drugs.

    Upon clicking the "email" tab, I was greeted with 10 seconds of white screen as my browser tried to load the page. "Why on Earth is this taking so long?" I asked myself. "It can't be, like, trying to load every single valid email address at McGill, right? Lol. Haha. Hahaha. Hahahaha. Wait. ohgod"

    Words cannot express the horror I felt when I looked at the source of the page and found the code that I have reproduced above. The only time I've been more terrified was when I dreamt that I slept through my cryptography exam and then began to suffocate (but that turned out to just be the blanket).

    If you want to take a look at this code in the wild, well, you can't anymore. I reported it as a data leak vulnerability and the vendor actually fixed it within two weeks. However, the code as it appeared then is essentially what you see above, shitty whitespace and all; I just removed some code irrelevant to the functionality at hand and also removed the actual list of emails. Apologies to Zamboni Man if you exist and are a student at McGill.

    I would really like to know the combination and quantities of drugs consumed that resulted in this code. Do you know? Can you hook me up? Send me a message on Twitter @dellsystem.

    Legal addendum

    I hereby claim that the above snippet of code is fair dealing according to section 29 of the Copyright Act of Canada, which allows for usage for the purpose of research, private study, education, parody, satire, criticism or review and news reporting. In fact, this snippet was used for nearly all of these purposes:


    Research

    I stumbled upon this snippet while doing research for my B.S. thesis, "Not even once: Prolonged drug use and its effect on code quality". It will be the perfect B.S. thesis, complete with B.S. graphs and footnotes in which I beg the reader to take me on as a grad student.

    Private study

    I saved the source code of the page because I wanted to privately study the correlation between prolonged drug use and code quality.


    Education

    I would like to educate the public on the deleterious effects of prolonged drug use on code quality.


    Parody

    I wish I could say that the code above is a parody of the actual source code I encountered. I really do. But I can't.


    Satire

    The code above was used as part of a scathing satirical indictment of the modern programmer's tragic penchant for drugs. Also, society.

    Criticism or review

    3/10 would not bang

    News reporting

    BREAKING: Prolonged drug usage can result in terrible code. CNN had better link back to me when they pick up this story, I could use the ad revenue.

    by dellsystem (Wendy Liu) at September 02, 2013 04:00 AM

    September 01, 2013

    Kai Torben Ohlhus

    Lost in translation with nice results

    Last week I spent reimplementing the IC(0) Fortran routines by Mark T. Jones and Paul E. Plassmann (see my last blog post) in C++. I did this because

    • Octave is natively C++. Thus it is possible to adapt the code to more sophisticated data types (especially, I think, the complex sparse data types) so it will be able to handle more general input.
    • I am much more familiar with C++. This makes it easier for me to extend the code with features like the Modified Incomplete Cholesky Factorization (MIC) option and diagonal shifting.
    The result is accessible from my repository:
    • The standard IC(0) routine
    • IC(0) with the Jones and Plassmann strategy (preserve the number of non-zeros per row/column, but keep the largest ones)
    For testing the implementation, I took the two examples from the last post and added one further example test case. The m-files are available from here.

    Octave's output

    Test Problem 1

      err_ichol0   =  0.0197360236512918
      err_ichol0jp =  0.0197548896386716

    Test Problem 2

      err_ichol0   =  0.0840790133633081
      err_ichol0jp =  0.0846790338305638

    Test Problem 3

      err_ichol0   =  0.0915989875674983
      err_ichol0jp =  0.0926062848306512

    MATLAB (R2012b) output

    Test Problem 1
      err_ichol = 0.019736023651292

    Test Problem 2
      err_ichol = 0.084079013363308

    Test Problem 3
      err_ichol = 0.091598987567498

    It was interesting to observe that the results of the standard IC(0) match those of MATLAB very well. The Jones and Plassmann strategy does not differ much from the standard IC(0), but according to their paper it should speed up the preconditioned solution. It will therefore be offered as an additional option, e.g. options.type = 'jp'.
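To make the "no-fill" idea concrete: IC(0) computes an approximate Cholesky factor that is only allowed to be nonzero where the lower triangle of A is nonzero. The following is a dense Python toy sketch of that idea, not the Fortran/C++ code discussed above:

```python
import math

def ic0(A):
    """No-fill incomplete Cholesky IC(0): L keeps the sparsity pattern of tril(A).

    Dense toy version for illustration; A must be symmetric positive definite,
    given as a list of row lists.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: same update as a complete Cholesky factorization.
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            if A[i][j] == 0.0:
                continue  # outside the pattern: no fill-in, entry stays zero
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L
```

On the 4x4 hydraulic-network matrix from the Algorithm 740 post below, the residual A - L*L' of this sketch is zero everywhere except at the two entries excluded from the pattern, and its Frobenius norm is of the same order as the value reported there.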

    by Kai Torben Ohlhus at September 01, 2013 04:32 PM

    Marco Vassallo

    update #14

    During the last weeks I have added some new functions to the code and worked on a new, more involved example.

    The last functions added are:
    • SubSpace extracts a subspace from a vector-valued one. For example, if our space is P2 x P0 we can extract one or the other and then apply BCs only where necessary.
    • fem_eval takes as input a Function and a coordinate and returns a vector representing the value of the function at that point.
    • For dealing with forms of rank 0, i.e. with functionals, we have now added the function fem_create_functional to create one from a .ufl file. We have also extended the function assemble, which returns the corresponding double value.
    • plot_2d and plot_3d: these functions allow us to plot a function by specifying a mesh and the value of the function at every node of the mesh. This is something which could be useful outside of fem-fenics as well.
    I implemented these new functions mostly because I realized that I needed them for the new problem that I'm studying.

    The example which I'm studying is the Darcy-Stokes equation. More precisely, in this post I start to treat the problem of finding a FEM which shows stable convergence properties when the perturbation parameter tends to zero. Following Mardal, Tai & Winther (2002) we study the equation
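(The equation itself appeared as an image in the original post. Reconstructed from the UFL form and the right-hand side ff used in the code below, so the sign conventions are my reading of that code, it is the singularly perturbed Darcy-Stokes problem:)

```latex
u - \varepsilon^2 \Delta u - \nabla p = f \quad \text{in } \Omega, \qquad
\nabla\cdot u = g \quad \text{in } \Omega, \qquad
u = 0 \quad \text{on } \partial\Omega .
```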
    What is shown in this paper is that the usual FEM methods for the Stokes equation, like P2-P0, Crouzeix-Raviart and Mini elements, are not stable with respect to the perturbation parameter when it tends to zero. The authors have thus introduced a new non-conforming FE called "Mardal-Tai-Winther (MTW)", which shows stable convergence properties.

    Because these elements are all available in Dolfin, I thought that it could be a good test for our program to reproduce some of the results presented in the paper. What I have been able to reproduce so far is the P2-P0 case.

    Furthermore, while working on the example, I unfortunately discovered that the MTW finite elements are not really available in Dolfin (they are only partially supported) but, as you can read here, Prof. Mardal seems ready to help us and offered some insight about his FEM.

    The ufl code is reported below, where I have added a Lagrange multiplier in order to ensure that the mean value of the pressure is equal to zero.

    Ufl code

    # P2-P0 finite element
    P2 = VectorElement("CG", triangle, 2)
    P0  = FiniteElement("DG", triangle, 0)

    R = FiniteElement ("R", triangle, 0)

    W = MixedElement ([P2, P0, R])

    (u, p, c) = TrialFunctions(W)
    (v, q, d) = TestFunctions(W)

    f = Coefficient(P2)
    g = Coefficient(P0)
    ep  = Constant(triangle)

    a = (dot(u, v) + ep*ep*div(u)*div(v) + ep*ep*dot(rot(u), rot(v)) + div(v)*p - div(u)*q + p*d + q*c)*dx
    L = (dot (f, v) - g*q)*dx

    Fem-fenics code

    pkg load msh
    pkg load fem-fenics

    fem_init_env ();

    fem_create_all ('DS');
    fem_create_functional ('Err_v');
    fem_create_functional ('Err_eps');
    fem_create_functional ('Err_p');

    uex = @(x, y) [-2*pi*sin(pi*x)*sin(pi*x)*cos(pi*y)*sin(pi*y);...

    pex = @(x, y) sin(pi*x);

    gradp = @(x, y) [pi*cos(pi*x); 0];

    laplu = @(x, y) [-4*pi^3*(1 - 4*sin(pi*x)*sin(pi*x))*sin(pi*y)*cos(pi*y);...
                      4*pi^3*(1 - 4*sin(pi*y)*sin(pi*y))*sin(pi*x)*cos(pi*x)];

    eps_vector = [1 2^-2 2^-4 2^-8 0]';
    nel_vector   = [4 8 16 32 64]';
    h_vector = 1./nel_vector;

    for ii = 1:numel(nel_vector)

      nel = nel_vector(ii);
      x = y = linspace (0, 1, nel + 1);
      msho = msh2m_structured_mesh (x, y, 1, 1:4);
      mshd = Mesh (msho);

      v = Expression ('v', uex);
      p = Expression ('p', pex);

      V  = DS_FunctionSpace (mshd);
      V0 = SubSpace (V, 0);

      bc = DirichletBC (V0, @(x,y) [0, 0], 1:4);

      for jj = 1:numel(eps_vector)

        epsi = eps_vector(jj);
        ff = @(x, y) uex (x, y) - epsi * epsi * laplu (x, y) - gradp (x, y);
        f = Expression ('f', ff);
        g = Expression ('g', @(x,y) 0);
        ep = Expression ('ep', @(x, y) epsi);

        a = DS_BilinearForm (V, ep);
        L = DS_LinearForm (V, f, g);

        [A, b] = assemble_system (a, L, bc);

        sol = A \ b;

        func = Function ('uh', V, sol);
        vh = Function ('vh', func, 0);
        ph = Function ('ph', func, 1);

        M = Err_v_Functional (mshd, v, vh);
        err_v_L2(ii,jj) = sqrt (assemble (M));

        const = fem_eval (ph, [0, 0]);
        pex = @(x, y) sin(pi*x) + const;
        p = Expression ('p', pex);
        M = Err_p_Functional (mshd, p, ph);
        err_p_L2(ii,jj) = sqrt (assemble (M));

        M = Err_eps_Functional (mshd, v, vh, ep);
        err_v_eps(ii,jj) = sqrt (assemble (M));

      end
    end

    pause ()

    rate_v_L2 = (log (err_v_L2 (1:end-1,:)) - log (err_v_L2 (2:end,:)))./((log (h_vector (1:end-1)) - log (h_vector (2:end))));
    conv_v_L2 = mean (rate_v_L2)

    rate_p_L2 = (log (err_p_L2 (1:end-1,:)) - log (err_p_L2 (2:end,:)))./((log (h_vector (1:end-1)) - log (h_vector (2:end))));
    conv_p_L2 = mean (rate_p_L2)

    rate_v_eps = (log (err_v_eps (1:end-1,:)) - log (err_v_eps (2:end,:)))./((log (h_vector (1:end-1)) - log (h_vector (2:end))));
    conv_v_eps = mean (rate_v_eps)
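The rate computation above is the standard empirical-order estimate between successive mesh refinements: rate = log(e_i/e_{i+1}) / log(h_i/h_{i+1}). A small self-contained Python sketch of the same formula, using synthetic O(h^2) errors in place of the FEM data:

```python
import math

def convergence_rates(errors, h):
    """Empirical order between successive refinements:
    rate_i = log(e_i / e_{i+1}) / log(h_i / h_{i+1})."""
    return [math.log(errors[i] / errors[i + 1]) / math.log(h[i] / h[i + 1])
            for i in range(len(errors) - 1)]

# Synthetic errors behaving exactly like C * h^2, so every rate should be 2:
h = [1.0 / n for n in (4, 8, 16, 32, 64)]
errors = [0.7 * hh ** 2 for hh in h]
rates = convergence_rates(errors, h)
```

For a method of order p the computed rates approach p as the mesh is refined, which is exactly what the conv_* tables in the script above report.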


    The results that we obtain for the convergence rate in the P2-P0 case are in good agreement with the ones reported by Mardal, Tai & Winther, as can be checked in the table below. As can be seen from the table, the P2-P0 elements aren't uniformly stable when the parameter approaches zero.

    Left: our results. Right: Mardal, Tai & Winther 2002

    by Gedeone at September 01, 2013 02:20 PM

    August 30, 2013

    Anirudha Bose


    For Octave to be compiled with Java support enabled, we need a JDK installation and its header files available on the host system where Octave is built. For the cross-mingw compilation this was fairly difficult; you can refer to the previous posts on this blog related to adding Java support for more information.

    For Mac OS X, I installed the latest Oracle Java SE Development Kit from Oracle’s website. You need to accept the Oracle Binary Code License Agreement for Java SE to download the software.
    checking for java... /usr/bin/java
    checking for javac... /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/javac
    checking for jar... /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/jar
    checking for Java version... 1.6.0_51
    checking for libjvm.dylib... /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Libraries
    checking for include file <jni.h>... /System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers

    This is the platform-specific jni_md.h file used by configure. It comes with the JDK installation for Mac OS X.

    The following are extracts of the configure details of Octave:

    Java home: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
    Java JVM path: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Libraries
    Java CPPFLAGS: -I/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/include -I/System/Library/Frameworks/JavaVM.framework/Home/include -I/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Headers -I/System/Library/Frameworks/Jav$
    Java libraries: -framework JavaVM

    Build Java interface: yes

    by Anirudha at August 30, 2013 11:53 AM

    August 29, 2013

    Ahsan Ali Shahid

    Agora Update #7 #8

    For the past two weeks I worked with Jordi on the Octave servers to deploy the updates that I had made to the Agora website. There was a lot to learn from that, as I had never had a chance to work directly on a server before. We applied the updates to the site by merging the changes and then updating the central repo. At the moment Agora has all the latest updates. Other features and bug fixes that have been done during this time period are:

    1. Fixed CSS issues of the website on the server.

    2. Fixed the randomly generated id for each snippet. Previously there was no mechanism to ensure that no two snippets got the same URL; now each snippet has its own unique URL.

    3. Applied a schemamigration to the Agora database on the production server so that the newly added comments and ratings modules can work properly. I screwed up the whole site once, but thanks to the precaution taken before applying the update, we had backups of both the database and the site files. South was used for the migration.
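A common way to guarantee unique snippet URLs like this is to draw a random slug and retry on collision. The sketch below is hypothetical Python illustrating the mechanism, not Agora's actual Django code:

```python
import secrets
import string

def unique_slug(existing, length=8):
    """Return a random URL slug not already present in `existing`.

    `existing` stands in for the set of slugs stored in the database
    (a hypothetical name; the real check would be a DB uniqueness query).
    """
    alphabet = string.ascii_lowercase + string.digits
    while True:
        slug = ''.join(secrets.choice(alphabet) for _ in range(length))
        if slug not in existing:
            existing.add(slug)  # reserve it so later calls cannot reuse it
            return slug
```

With 36^8 possible slugs, collisions are rare, so the retry loop almost always finishes on the first draw; the loop simply makes duplicates impossible rather than merely unlikely.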

    There are some issues that still need fixing, but I strongly believe that those issues are server-side. One is that when a new snippet is posted, a 403 error is thrown by the server, which is rather strange; I am unable to figure out why this happens, since the snippet is successfully posted to the database.

    Now I am looking into implementing a way to import and export a bundle to an hg repo. That’s all from my side for now.


    by ahsanali at August 29, 2013 09:31 PM

    Anirudha Bose


    It is interesting to see that MXE Octave has become more capable than the original MXE. It can not only be used to cross-compile Octave, but is now capable of natively compiling Octave (mingw and msvc) as well. Since I used MXE Octave to complete the first half of the project, cross-compiling Octave to Windows, it is wise to re-use it to natively compile Octave under Mac OS X. The builds, as far as I can tell, are supported on a range of platforms; this wouldn’t have been possible if I had built an app bundle.

    On the Mac I am using for my work, config.guess returns x86_64-apple-darwin12.2.1, while the command gcc -dumpmachine returns i686-apple-darwin11. Hence the target I am using to compile Octave is i686-apple-darwin11; using x86_64-apple-darwin12.2.1 as the target results in many errors like "Undefined symbols for architecture x86_64".

    I am writing down some of the things I did to build a very basic version of Octave (lacking support for many libraries) for Mac OS X using my repo mxe-octave-anirudha.

    • Since GCC doesn’t work for native builds yet, I used the system GCC provided by Xcode. The variable USE_SYSTEM_GCC is automatically set to yes if we pass the --enable-native-build option to configure.
    • So far I have compiled everything using the cross-compile configurations, making changes wherever required. On my system, configure was unable to find the Fortran compiler. I had to manually add MXE_F77 := /opt/local/bin/gfortran-mp-4.7 in the Makefile, as well as add set(CMAKE_Fortran_COMPILER /opt/local/bin/gfortran-mp-4.7) in usr/x86_64-apple-darwin12.2.1/share/cmake/mxe-conf.cmake. This was necessary for packages like BLAS, Lapack, and Octave. I will later commit a patch to detect the Fortran compiler path automatically.
    • The compilation is likely to stop when dbus is built, because dbus needs root privileges during the build; otherwise it throws a “Permission denied” error. Running sudo make dbus works fine.
    • Since we are building for a 64-bit target, we must use ./Configure darwin64-x86_64-cc instead of ./config to build the openssl package.
    • Compiling Qt succeeds, but MXE_PKG_CONFIG is unable to extract QT_LDFLAGS and QT_LIBS; the variable (PKG)_CONFIGURE_ENV must be set correctly for Mac OS X. To get past the Qt build I used the MSVC configurations. The command pkg-config --libs-only-L ./usr/x86_64-apple-darwin12.2.1/lib/pkgconfig/QtCore.pc returns nothing, which might be the reason why Octave doesn’t build with the GUI enabled.
    • I had to disable Readline support by adding --disable-readline in src/. Since my repo uses Octave 3.7.5, I had to add the patch to make Ctrl-C abort the actual octave command on Linux (bug #37672). In Octave 3.7.6 the patch has been merged, so it must be removed from the src folder.

    After the build completed, Octave was configured for “x86_64-apple-darwin12.2.1”. I will be pushing the changesets to my repo soon. Octave can be launched by running ./usr/x86_64-apple-darwin12.2.1/bin/octave. It will launch the CLI version of Octave, since the GUI was disabled during the build. Here is a screenshot.


    by Anirudha at August 29, 2013 08:11 AM

    August 26, 2013

    Kai Torben Ohlhus

    A Fortran story with Algorithm 740

    Last week I was looking for a good existing implementation of the incomplete Cholesky factorization and finally decided on Algorithm 740 by Mark T. Jones and Paul E. Plassmann. Besides this code, I had a deeper look at many promising libraries, but found nothing I could build upon. Algorithm 740 is a Fortran implementation of the no-fill incomplete Cholesky factorization [IC(0) for short] for column-wise stored matrices. What more do you want?

    To see if this code works, I started writing an interface for Octave and tested it using two small toy problems from Quarteroni, Saleri, Gervasio: Scientific Computing with MATLAB and Octave: Third Edition, Springer 2010.

    % Problem 5.1 (Hydraulic network)

    A =[-0.37 0.05 0.05 0.07;
        0.05 -0.116 0 0.05;
        0.05 0 -0.116 0.05;
        0.07 0.05 0.05 -0.202];
    A = -A;

    % Test for positive definite
    for i = 1:length (A)
        assert (det (A (1:i,1:i)) > 0)
    end

    % Prepare input for solver
    D = diag (A);
    A_off = sparse (tril (A - diag (D)));

    [L_off, D] = JPICC_wrapper (A_off, D);
    L = L_off + diag (D);

    % Compute error
    R = A - L * L'
    norm (R,'fro')

    >> norm = 0.0095646

    % Problem 5.4 (Capillary networks)

    A = [-1/4 1/10 1/10 0 0 0 0 0 0 0 0 0 0 0 0;
         0 -1/2 0 1/5 1/5 0 0 0 0 0 0 0 0 0 0;
         0 0 -1/2 0 0 1/5 1/5 0 0 0 0 0 0 0 0;
         0 0 0 -1 0 0 0 0.4 0.4 0 0 0 0 0 0;
         0 0 0 0 -1 0 0 0 0 0.4 0.4 0 0 0 0;
         0 0 0 0 0 -1 0 0 0 0 0 0.4 0.4 0 0;
         0 0 0 0 0 0 -1 0 0 0 0 0 0 0.4 0.4;
         0 0 0 0 0 0 0 -2 0 0 0 0 0 0 0;
         0 0 0 0 0 0 0 0 -2 0 0 0 0 0 0;
         0 0 0 0 0 0 0 0 0 -2 0 0 0 0 0;
         0 0 0 0 0 0 0 0 0 0 -2 0 0 0 0;
         0 0 0 0 0 0 0 0 0 0 0 -2 0 0 0;
         0 0 0 0 0 0 0 0 0 0 0 0 -2 0 0;
         0 0 0 0 0 0 0 0 0 0 0 0 0 -2 0;
         0 0 0 0 0 0 0 0 0 0 0 0 0 0 -2];
    A = -(A + triu (A, 1)');

    % Test for positive definite
    for i = 1:length (A)
        assert(det(A(1:i,1:i)) > 0)
    end

    % Prepare input for solver
    D = diag (A);
    A_off = sparse (tril (A - diag(D)));

    [L_off, D] = JPICC_wrapper (A_off, D);
    L = L_off + diag(D);

    % Compute error
    R = A - L * L'
    norm (R,'fro')

    >> norm = 0.53208
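The determinant loop in both examples is Sylvester's criterion: a symmetric matrix is positive definite iff all of its leading principal minors have positive determinant. An equivalent check, and numerically the more robust one, is to attempt a complete Cholesky factorization and see whether it succeeds. A Python sketch of that idea (not part of the original Fortran/Octave code):

```python
import math

def is_positive_definite(A):
    """Attempt a complete Cholesky factorization of symmetric A.

    Success (all pivots strictly positive) holds iff A is positive definite.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            return False  # non-positive pivot: A is not positive definite
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return True
```

Both test matrices above pass this check (they are diagonally dominant with positive diagonal after the sign flip A = -A), which is why the assert loops succeed.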

    The lack of a well-tested library like ITSOL makes this part of the GSoC more difficult. On the other hand, MATLAB implements "only" IC(0), ICT and the modified IC option. So there is much work left to do, but it should remain possible within the remaining weeks.

    by Kai Torben Ohlhus at August 26, 2013 06:41 PM

    Andrej Lojdl

    On screen LaTeX interpreter

    On-screen rendering using the LaTeX interpreter changed a few times. There were problems regarding the output image format. First I used the PPM format, but it doesn't support transparency, so I changed to the BMP format. After spending some time implementing a reader for this format and testing it, it turned out that GhostScript doesn't work nicely with it, so I changed again, to the PNG format.

    With the new format we get nice transparent rendering of the LaTeX file (constructed in Octave), and GhostScript uses subsample anti-aliasing to produce a nice image. The data is then read from the PNG image and transferred to an OpenGL buffer.

    The upgraded interpreter has some new functionality. It can change the color of text directly from Octave and then apply this color to all text that hasn't been colored previously. This exception is there because the user can color text directly in the input string; this way the user can have multicolored text. The user can also change the font size directly in Octave using the 'fontsize' parameter; see the example below.

    title('Text \color{red}{colored}','interpreter','latex','color','blue');

    To change the size of the text, the user can use the 'fontsize' property. Through this property the user changes the resolution of the GhostScript output, which is scaled to represent a direct change of font size in points.


    And here is an example of using special mathematical characters:

    title('$\oint \! \nabla f \, \dif t = 0$','interpreter','latex','fontsize',48);

    As you can see, it needs to be encapsulated in dollar signs. Adding bold, italic and other text styles would be an option but, as Matlab doesn't support this, we won't support it either.

    As you can see, there are some things that could still be improved. First and most obvious is alignment, but this isn't the priority. The next steps are to check the libraries and dependencies, and to configure this new interpreter to work on all major platforms. To check how development is evolving, feel free to download from my online repository.

    by AndrejLojdl at August 26, 2013 03:49 PM

    August 25, 2013

    Kai Torben Ohlhus

    Little update

    The repository mess

    I recreated the structure of my public repository because of non-conforming Mercurial commit messages. If you have already downloaded changesets from my old repository, I recommend stripping them:

       hg strip -r MY_OLD_CHANGESET

    and then pull the newer ones as described here.

    On my way to IC(0) and ICT

    To get these two implementations I had a look at several libraries again (many of them suggested by Nir):

    • Contains a very simple implementation of IC(0) in SparseLib++/1.7/src/
    • The implementation just consists of copying the lower triangular part; this can be done smarter according to the publications I have read so far.
    • Also looks promising, but I didn't get it to work with the MEX interface so far.
    • A good start for implementing an own version: a MEX interface with C code.
    • A deeper look at the underlying paper revealed that their understanding of an "incomplete Cholesky factorization" is: "If $Q$ is positive semidefinite and singular, then it is still possible to compute an “incomplete” Cholesky factorization $GG^T$, where some columns of $G$ are zero." That is definitely not what MATLAB's ichol does.
    • Not free, but can be requested from the authors. I started to port this function to C++ in order to test it, because it looked like one of the most promising ones.

    New insights about Scilab and ILU

    During my research about incomplete Cholesky factorizations, I found an interesting Scilab toolbox, Spilu. The developers of Scilab took Saad's SPARSKIT as the basis of their seemingly well-tested Fortran implementations of nearly all the ILUs I interfaced from ITSOL. It is definitely worth having a look at this library, but I don't know if it would have been reasonable to change from ITSOL to Scilab's implementation. Unfortunately Scilab doesn't seem to have any IC factorization...

    News from Yousef Saad

    I created another public repository with the changes Marco Atzeri and I made to ITSOL. The purpose of the repository is not to maintain an own version; it is to facilitate the decision process for upstream changes by Yousef Saad. He wants to consider my changes when Ruipeng Li is available again soon.

    by Kai Torben Ohlhus at August 25, 2013 04:34 PM

    August 21, 2013


    The function Support

    Functions are an important concept in programming languages. Complicated Octave programs can often be simplified by defining functions. JITC implemented anonymous and inline function support last year; now it can also be used to compile named functions.

    The current JITC only supports functions with a single return value. The general form of single-return-value functions is:

    function ret-var = func-name (arglist)

    When a function is called, JITC is triggered to compile its octave_user_function into LLVM IR. Here is the demo:

    -------------------- M-file --------------------
    function ret = fun (init, inc)
      ret = init;

      i = 0;
      do
        j = 0;
        do
          ret = ret + inc;
          j++;
        until (j == 10000)
        i++;
      until (i == 10000)
    endfunction

    result = fun(1, 1);

    Octave reads its initialization files and then executes the script; as previously described, this triggers JITC. To skip the initialization files and print the JIT debug output, run:

    octave -f --no-init-path --debug-jit fun.m 

    The JITC linear IR and the LLVM IR for function fun are as follows:

    -------------------- Compiling function --------------------
    function ret = fun (init, inc)
    ret = init;
    i = 0;
    do
      j = 0;
      do
        ret = ret + inc;
        j++;
      until(j == 10000)
      i++;
    until(i == 10000)

    -------------------- octave jit ir --------------------
    body0:        %pred =
            any: #25 = call create_undef ()
            any: ans#26 = any: #25
            any: #10 = call create_undef ()
            any: j#11 = any: #10
            any: #6 = call create_undef ()
            any: i#7 = any: #6
            any: #3 = call create_undef ()
            any: ret#4 = any: #3
            scalar: inc#1 = extract inc
            scalar: init#0 = extract init
            scalar: #2 = call  (scalar: init#0)
            scalar: ret#5 = scalar: #2
            scalar: i#8 = scalar: 0
            branch: [live] do_until_body2
    do_until_body2:        %pred = body0, phi_split1
            scalar: ret#69 phi | body0 -> scalar: ret#5
                               | phi_split1 -> scalar: ret#18
            scalar: inc#63 phi | body0 -> scalar: inc#1
                               | phi_split1 -> scalar: inc#64
            scalar: i#60 phi | body0 -> scalar: i#8
                             | phi_split1 -> scalar: i#40
            any: ans#57 phi | body0 -> any: ans#26
                            | phi_split1 -> any: ans#82
            scalar: j#12 = scalar: 0
            branch: [live] do_until_body3
    do_until_body3:        %pred = do_until_body2, do_until_cond_check9
            scalar: ret#70 phi | do_until_body2 -> scalar: ret#69
                               | do_until_cond_check9 -> scalar: ret#18
            scalar: j#65 phi | do_until_body2 -> scalar: j#12
                             | do_until_cond_check9 -> scalar: j#24
            scalar: inc#64 phi | do_until_body2 -> scalar: inc#63
                               | do_until_cond_check9 -> scalar: inc#64
            scalar: i#61 phi | do_until_body2 -> scalar: i#60
                             | do_until_cond_check9 -> scalar: i#61
            scalar: #14 = call  (scalar: ret#70)
            scalar: #15 = call  (scalar: inc#64)
            scalar: #16 = call binary+ (scalar: #14, scalar: #15)
            branch: [live] do_until_body4
    do_until_body4:        %pred = do_until_body3
            scalar: ret#18 = scalar: #16
            scalar: #19 = call  (scalar: j#65)
            scalar: #20 = call unary++ (scalar: #19)
            branch: [live] do_until_body5
    do_until_body5:        %pred = do_until_body4
            scalar: #22 = call  (scalar: #19)
            branch: [live] do_until_body6
    do_until_body6:        %pred = do_until_body5
            scalar: j#24 = scalar: #20
            scalar: ans#27 = scalar: #22
            branch: [live] do_until_cond_check7
    do_until_cond_check7:        %pred = do_until_body6
            scalar: #29 = call  (scalar: j#24)
            bool: #30 = call binary== (scalar: #29, scalar: 10000)
            branch: [live] do_until_cond_check8
    do_until_cond_check8:        %pred = do_until_cond_check7
            bool: #32 = call logically_true (bool: #30)
            branch: [live] do_until_cond_check9
    do_until_cond_check9:        %pred = do_until_cond_check8
            cond_branch: bool: #32, [live] do_until_tail10, [live] do_until_body3
    do_until_tail10:        %pred = do_until_cond_check9
            scalar: #35 = call  (scalar: i#61)
            scalar: #36 = call unary++ (scalar: #35)
            branch: [live] do_until_tail11
    do_until_tail11:        %pred = do_until_tail10
            scalar: #38 = call  (scalar: #35)
            branch: [live] do_until_tail12
    do_until_tail12:        %pred = do_until_tail11
            scalar: i#40 = scalar: #36
            scalar: ans#41 = scalar: #38
            branch: [live] do_until_cond_check13
    do_until_cond_check13:        %pred = do_until_tail12
            scalar: #43 = call  (scalar: i#40)
            bool: #44 = call binary== (scalar: #43, scalar: 10000)
            branch: [live] do_until_cond_check14
    do_until_cond_check14:        %pred = do_until_cond_check13
            bool: #46 = call logically_true (bool: #44)
            branch: [live] do_until_cond_check15
    do_until_cond_check15:        %pred = do_until_cond_check14
            cond_branch: bool: #46, [live] do_until_tail16, [live] phi_split1
    phi_split1:        %pred = do_until_cond_check15
            any: #81 = call (any) (scalar: ans#41)
            any: ans#82 = any: #81
            branch: [live] do_until_body2
    do_until_tail16:        %pred = do_until_cond_check15
            branch: [live] final17
    final17:        %pred = do_until_tail16
            scalar: j#66 phi | do_until_tail16 -> scalar: j#24
            scalar: i#59 phi | do_until_tail16 -> scalar: i#40
            scalar: ans#56 phi | do_until_tail16 -> scalar: ans#41
            call destroy (scalar: ans#56)
            call destroy (scalar: i#59)
            call destroy (scalar: inc#64)
            call destroy (scalar: init#0)
            call destroy (scalar: j#66)
            return scalar: ret#18

    -------------------- raw function --------------------
    define double @foobar(double, double) alwaysinline {
      br label %body

    body:                                             ; preds = %prelude
      %2 = call %octave_base_value* @octave_jit_create_undef()
      %3 = call %octave_base_value* @octave_jit_create_undef()
      %4 = call %octave_base_value* @octave_jit_create_undef()
      %5 = call %octave_base_value* @octave_jit_create_undef()
      %6 = call double @id_scalar(double %0)
      call void @octave_jit_release_any(%octave_base_value* %5)
      call void @octave_jit_release_any(%octave_base_value* %4)
      br label %do_until_body

    do_until_body:                                    ; preds = %phi_split, %body
      %7 = phi double [ %6, %body ], [ %17, %phi_split ]
      %8 = phi double [ %1, %body ], [ %13, %phi_split ]
      %9 = phi double [ 0.000000e+00, %body ], [ %25, %phi_split ]
      %10 = phi %octave_base_value* [ %2, %body ], [ %30, %phi_split ]
      br label %do_until_body1

    do_until_body1:                                   ; preds = %do_until_cond_check6, %do_until_body
      %11 = phi double [ %7, %do_until_body ], [ %17, %do_until_cond_check6 ]
      %12 = phi double [ 0.000000e+00, %do_until_body ], [ %19, %do_until_cond_check6 ]
      %13 = phi double [ %8, %do_until_body ], [ %13, %do_until_cond_check6 ]
      %14 = phi double [ %9, %do_until_body ], [ %14, %do_until_cond_check6 ]
      %15 = call double @id_scalar(double %11)
      %16 = call double @id_scalar(double %13)
      %17 = call double @"octave_jit_+_scalar"(double %15, double %16)
      br label %do_until_body2

    do_until_body2:                                   ; preds = %do_until_body1
      %18 = call double @id_scalar(double %12)
      %19 = call double @"octave_jit_++"(double %18)
      br label %do_until_body3

    do_until_body3:                                   ; preds = %do_until_body2
      %20 = call double @id_scalar(double %18)
      br label %do_until_body4

    do_until_body4:                                   ; preds = %do_until_body3
      br label %do_until_cond_check

    do_until_cond_check:                              ; preds = %do_until_body4
      %21 = call double @id_scalar(double %19)
      %22 = call i1 @"octave_jit==_scalar"(double %21, double 1.000000e+04)
      br label %do_until_cond_check5

    do_until_cond_check5:                             ; preds = %do_until_cond_check
      %23 = call i1 @id_bool(i1 %22)
      br label %do_until_cond_check6

    do_until_cond_check6:                             ; preds = %do_until_cond_check5
      br i1 %23, label %do_until_tail, label %do_until_body1

    do_until_tail:                                    ; preds = %do_until_cond_check6
      %24 = call double @id_scalar(double %14)
      %25 = call double @"octave_jit_++"(double %24)
      br label %do_until_tail7

    do_until_tail7:                                   ; preds = %do_until_tail
      %26 = call double @id_scalar(double %24)
      br label %do_until_tail8

    do_until_tail8:                                   ; preds = %do_until_tail7
      br label %do_until_cond_check9

    do_until_cond_check9:                             ; preds = %do_until_tail8
      %27 = call double @id_scalar(double %25)
      %28 = call i1 @"octave_jit==_scalar"(double %27, double 1.000000e+04)
      br label %do_until_cond_check10

    do_until_cond_check10:                            ; preds = %do_until_cond_check9
      %29 = call i1 @id_bool(i1 %28)
      br label %do_until_cond_check11

    do_until_cond_check11:                            ; preds = %do_until_cond_check10
      br i1 %29, label %do_until_tail12, label %phi_split

    phi_split:                                        ; preds = %do_until_cond_check11
      %30 = call %octave_base_value* @octave_jit_cast_any_scalar(double %26)
      br label %do_until_body

    do_until_tail12:                                  ; preds = %do_until_cond_check11
      br label %final

    final:                                            ; preds = %do_until_tail12
      %31 = phi double [ %19, %do_until_tail12 ]
      %32 = phi double [ %25, %do_until_tail12 ]
      %33 = phi double [ %26, %do_until_tail12 ]
      %34 = call double @id_scalar(double %33)
      %35 = call double @id_scalar(double %32)
      %36 = call double @id_scalar(double %13)
      %37 = call double @id_scalar(double %0)
      %38 = call double @id_scalar(double %31)
      ret double %17

    -------------------- optimized and wrapped --------------------
    define %octave_base_value* @fun_wrapper(%octave_base_value**) alwaysinline {
      %1 = load %octave_base_value** %0, align 8
      %2 = call double @octave_jit_cast_scalar_any(%octave_base_value* %1)
      %3 = getelementptr inbounds %octave_base_value** %0, i64 1
      %4 = load %octave_base_value** %3, align 8
      %5 = call double @octave_jit_cast_scalar_any(%octave_base_value* %4)
      %6 = call %octave_base_value* @octave_jit_create_undef()
      %7 = call %octave_base_value* @octave_jit_create_undef()
      %8 = call %octave_base_value* @octave_jit_create_undef()
      %9 = call %octave_base_value* @octave_jit_create_undef()
      call void @octave_jit_release_any(%octave_base_value* %9)
      call void @octave_jit_release_any(%octave_base_value* %8)
      br label %do_until_body.i

    do_until_body.i:                                  ; preds = %phi_split.i, %body
      %10 = phi double [ %2, %body ], [ %14, %phi_split.i ]
      %11 = phi double [ 0.000000e+00, %body ], [ %17, %phi_split.i ]
      br label %do_until_body1.i

    do_until_body1.i:                                 ; preds = %do_until_body1.i, %do_until_body.i
      %12 = phi double [ %10, %do_until_body.i ], [ %14, %do_until_body1.i ]
      %13 = phi double [ 0.000000e+00, %do_until_body.i ], [ %15, %do_until_body1.i ]
      %14 = fadd double %5, %12
      %15 = fadd double 1.000000e+00, %13
      %16 = fcmp ueq double %15, 1.000000e+04
      br i1 %16, label %do_until_tail.i, label %do_until_body1.i

    do_until_tail.i:                                  ; preds = %do_until_body1.i
      %17 = fadd double 1.000000e+00, %11
      %18 = fcmp ueq double %17, 1.000000e+04
      br i1 %18, label %foobar.exit, label %phi_split.i

    phi_split.i:                                      ; preds = %do_until_tail.i
      %19 = call %octave_base_value* @octave_jit_cast_any_scalar(double %11)
      br label %do_until_body.i

    foobar.exit:                                      ; preds = %do_until_tail.i
      %20 = call %octave_base_value* @octave_jit_cast_any_scalar(double %14)
      ret %octave_base_value* %20

by L Y.H. at August 21, 2013 09:13 AM

    August 20, 2013

    Vytautas Jančauskas

    Audio File Reading and Writing

I have implemented audioinfo and audioread, and they should work exactly as they do in MATLAB. Here is an example session.

    octave:1> audioinfo('technical_difficulties.wav')
    ans =

      scalar structure containing the fields:

        Filename = technical_difficulties.wav
        CompressionMethod =
        NumChannels =  2
        SampleRate =  44100
        TotalSamples =  305576
        Duration =  6.9292
        BitsPerSample =  16
        BitRate = -1
        Title =
        Artist =
        Comment =

    octave:2> [y, Fs] = audioread('technical_difficulties.wav');
    octave:3> player = audioplayer(y, Fs);
    octave:4> play(player);
For further information about this functionality, refer to the MATLAB manual pages on these functions: here and here.
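As a rough illustration of what audioread returns, here is a self-contained Python sketch using only the standard library. It is not the actual implementation (which uses libsndfile); it synthesizes a short 16-bit mono WAV in memory, then reads it back and scales the integer samples to [-1, 1], the same normalization audioread applies. All names here are my own choices for the example.

```python
import io
import math
import struct
import wave

rate, nsamples = 44100, 1000
buf = io.BytesIO()

# Synthesize a short 16-bit mono WAV in memory so the example is self-contained.
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)        # 16-bit samples
    w.setframerate(rate)
    frames = b"".join(
        struct.pack("<h", int(32767 * math.sin(2 * math.pi * 440 * t / rate)))
        for t in range(nsamples))
    w.writeframes(frames)

# Read it back, scaling int16 samples to floats in [-1, 1] like audioread does.
buf.seek(0)
with wave.open(buf, "rb") as w:
    n = w.getnframes()
    fs = w.getframerate()
    raw = w.readframes(n)
    y = [s / 32768.0 for s in struct.unpack("<%dh" % n, raw)]

duration = n / fs            # corresponds to the Duration field of audioinfo
```

The `y`, `fs`, and `duration` values correspond to audioread's outputs and audioinfo's SampleRate and Duration fields.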

by Vytautas Jančauskas at August 20, 2013 11:07 AM

    Kai Torben Ohlhus

    Getting ITSOL to work

At the moment I'm writing an interface for GNU/Octave to make use of the ITSOL library. This post is a collection of information on how to get this library onto your machine.
As far as I can tell, there is only an outdated ITSOL package maintained in the Debian repository, so one has to build the library and copy the necessary files to the file system.

    Build the ITSOL-library from source

    1. Get the source tar-archive
    2. Extract the archive: tar -xvf ITSOL_2.tar.gz
    3. Get my patch
    4. Apply the patch to the extracted archive:
      • cd ITSOL_2
      • patch -p1 < itsol.patch
    5. Build the library using: make
      • Now the shared and static library are created in ITSOL_2/LIB
    6. Copy the header files from ITSOL_2/INC to /usr/include/itsol (root rights may be required)
    7. Copy the two libraries from ITSOL_2/LIB to /usr/lib
    8. Make symbolic links
      • ln -s
      • ln -s

    Hint for steps 6+7: If you choose a custom location, you can build Octave later using the configure options:
    • --with-itsol-includedir=DIR
    • --with-itsol-libdir=DIR
See configure --help for details.



    • MinGW: ??
    • Cygwin: ??


    • ??

by Kai Torben Ohlhus at August 20, 2013 03:49 AM

    Becoming complex with ZTISOL

Two open tasks remain for me until the midterm next week. I want to write more comprehensive tests, and I have to get ILUTP to work. For the latter, I don't think I will succeed without further support from the authors of ITSOL, whom I have already contacted.

While waiting for a response, Nir and I decided to focus on my optional goal of complex ILU versions, which are realized in the ZITSOL library. It is straightforward to interface them analogously to the real versions. After tidying up my code, I will submit the changes to my development repository this weekend. As a first step, I spent some time getting ZITSOL into an appropriate library format, analogous to the ITSOL library.

    Build the ZITSOL-library from source

    1. Get the source tar-archive
    2. Extract the archive: tar -xvf ZITSOL_1.tar.gz
    3. Get my patch
    4. Apply the patch to the extracted archive:
      • cd ZITSOL_1
      • patch -p1 < itsol.patch
    5. Build the library using: make
      • Now the shared and static library are created in ZITSOL_1
    6. Copy the four header files from ZITSOL_1 and ZITSOL_1/LIB to /usr/include/zitsol (root rights may be required)
    7. Copy the two libraries from ZITSOL_1 to /usr/lib
    8. Make symbolic links
      • ln -s
      • ln -s

by Kai Torben Ohlhus at August 20, 2013 03:48 AM

    Midterm summary

Today it is time for the midterm and to summarize what has already been done. Here again is my favorite picture showing my progress, followed by some text.

    Goals for the Midterm:

• Integration of ITSOL completed, callable from Octave
    • Wrapper file for MATLAB compatibility (ilu.m)
    • Small test cases for incomplete LU-factorization, showing their correctness
    • Documentation of incomplete LU-factorization types
• Comprehensive test cases for all types of incomplete LU-factorization (also proving compatibility with MATLAB)
    • (optional) Complex version for the incomplete-factorizations
All in all I'm satisfied with the first part of this GSoC. I wrote about 2374 SLOC (rough wc -l estimate), patched two libraries and modified some config files. The major part of my work, except for the integration into the Octave build system, can be downloaded from here and tried out without needing a development version.

Going from the bottom up in the picture: I'm not an expert, but I am very familiar with ITSOL and ZITSOL now. I created patches for both libraries (also included in my archive above) so that one can use them on a Unix-like system, but so far they are only tested on my *buntu systems. A special part of this work is that I patched a whole ZILUC function into ZITSOL, which should be used with care. All the other complex versions are mainly implemented by switching the data type. For small examples this approach works for ILUC as well, but some differences were observable.

With the patched libraries one can go a level up, where I created interfaces to the most relevant functions from the preconditioner libraries. To master this task I needed to get to know the different matrix storage formats of Octave and (Z)ITSOL, and I had some trouble with the matrix conversion. Building upon Wei Jin's initial work, I successfully interfaced ILUK, ILUT and ILUC plus their complex versions. They all deliver results comparable to MATLAB. Sadly, the authors of (Z)ITSOL seem to be busy at the moment, so I still don't have any clue how to get ILUTP to run. I will try to solve this in the second part of GSoC, too.

When you have an interface, it still needs to be integrated into the big Octave code base. To that end I got familiar with the Octave build system and made all the necessary changes, including checking for the presence of ITSOL and ZITSOL and making sure that all necessary files are built by default.

On the last level there is "only" one m-file ensuring compatibility with MATLAB's ilu function. As one can see from the picture, I couldn't make everything green, for two reasons: first, ILUTP is missing, and second, I cannot include an efficient MILU (modified ILU) without changing the code of ITSOL entirely. The following two examples work the same in Octave and MATLAB; the last example in particular reveals the benefit of preconditioning.

    Example 1

A = gallery('neumann', 1600) + speye(1600);
setup.type = 'nofill';
nnz(A)
ans = 7840

nnz(lu(A))
ans = 126478

nnz(ilu(A, setup))
ans = 7840

    Example 2

    A = gallery ('wathen', 10, 10);
    b = sum (A,2);
    tol = 1e-8;
    maxit = 50;
    opts.type = 'crout';
    opts.droptol = 1e-4;
    [L, U] = ilu (A, opts);
    x = bicg (A, b, tol, maxit, L, U);
    norm(A * x - b, inf)
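Example 2 uses the ILU factors as a preconditioner for bicg. As a self-contained illustration of why preconditioning cuts iteration counts, here is a pure-Python conjugate gradient with a simple Jacobi (diagonal) preconditioner on an ill-conditioned SPD matrix. This is my own sketch, not ITSOL's ILU and not bicg: Jacobi is a much weaker preconditioner, but the principle is the same — a cheap approximation of A⁻¹ clusters the spectrum and the Krylov solver needs far fewer iterations.

```python
def cg(A, b, M_inv=None, tol=1e-10, maxit=500):
    """Conjugate gradient; M_inv, if given, is the inverse diagonal of a
    Jacobi preconditioner. Returns (solution, iterations used)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                                     # residual (x starts at zero)
    z = [M_inv[i] * r[i] for i in range(n)] if M_inv else r[:]
    p = z[:]
    rz = sum(r[i] * z[i] for i in range(n))
    for it in range(1, maxit + 1):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        for i in range(n):
            x[i] += alpha * p[i]
            r[i] -= alpha * Ap[i]
        if max(abs(v) for v in r) < tol:
            return x, it
        z = [M_inv[i] * r[i] for i in range(n)] if M_inv else r[:]
        rz_new = sum(r[i] * z[i] for i in range(n))
        beta, rz = rz_new / rz, rz_new
        p = [z[i] + beta * p[i] for i in range(n)]
    return x, maxit

# SPD test matrix with a widely varying diagonal -> badly conditioned.
n = 40
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = float((i + 1) ** 2)
    if i + 1 < n:
        A[i][i + 1] = A[i + 1][i] = 1.0
b = [sum(row) for row in A]                      # exact solution is all ones

x_plain, it_plain = cg(A, b)                     # no preconditioner
x_jac, it_jac = cg(A, b, M_inv=[1.0 / A[i][i] for i in range(n)])
```

On this matrix the Jacobi-preconditioned run converges in far fewer iterations than the plain one, mirroring (on a much smaller scale) what the ILU factors do for bicg in Example 2.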

All written functions contain extensive documentation (callable with doc ilu*) and comprehensive test cases, which you can run from the archive with
    test ilu
    test src/iluc
    test src/iluk
    test src/ilut
because the *.oct files and source files are in separate directories.

Besides coding, I also had the pleasure of joining this year's OctConf in Milan. I was given the opportunity to get to know the community better and gave a short presentation about my project.

Finally, I want to thank the Octave community for supporting me a lot during the first part of my project, and I really look forward to the second part and the time after it!

by Kai Torben Ohlhus at August 20, 2013 03:48 AM

      Yet another repository

Thanks to Jordi, I hereby announce a new Mercurial repository for the development of my GSoC project. It has a public web interface:

      Clone my whole repository:
          hg clone

If you already have an Octave repository and want to give my code a try, first check the incoming changes, then pull:
          hg incoming -r kais-work
          hg pull

Some notes follow, mainly for myself, on how to work with this Mercurial repository.

      See new page:

      Working with bookmarks

      hg bookmark kais-work    # Create bookmark
      hg update -r kais-work   # Switch to my bookmark 
hg push -B kais-work     # Make it publicly available

      Stay up to date with recent development changes:

      hg pull
      hg merge default
      hg commit -m "maint: periodic merge with Octave repository"
      hg update -C tip
      hg bookmark -f kais-work
      hg push ssh://

      See new page:

by Kai Torben Ohlhus at August 20, 2013 03:46 AM

      GSoC Part 2

Now the second part of the Google Summer of Code begins. Here is what I plan to do in that time.

Goals for the Final Evaluation 2013-09-16 (Monday, "Suggested 'pencils down' date"):

      • Implementations for IC(0) and ICT completed
      • Wrapper file for MATLAB compatibility (ichol.m) finished
      • Small test cases for all incomplete-factorizations
      • Documentation of incomplete LU-factorization types
• Comprehensive test cases for all types of incomplete factorization (also proving compatibility with MATLAB)
      • (optional) Complex version for the incomplete-factorizations


by Kai Torben Ohlhus at August 20, 2013 03:22 AM

      August 18, 2013

      Vytautas Jančauskas

      Further Development

Since we already have a working version of the audioplayer and audiorecorder functionality that allows one to write callbacks in Octave, here is my plan for further development. There are some issues to be ironed out, but I think the interface and functionality will stay more or less the same. We have the firm "pencils down" date set at September 23 by Google.

1. August 19-25: implement audioread and audioinfo using libsndfile.
2. August 26-September 1: implement audiowrite.
      3. Use the remainder of the time to integrate everything into Octave proper and fix any issues, write documentation, tests, etc.

by Vytautas Jančauskas at August 18, 2013 09:00 AM

      August 17, 2013

      Vytautas Jančauskas

      Changes to Octave Callback Interface

The interface to PortAudio that allows you to write callback functions in Octave has undergone slight changes. From now on, sound is returned in a single matrix with one or two columns (depending on the number of channels) and as many rows as there are audio frames. An example playing back sine waves of different frequencies on the two channels is given below.

      function [ sound, status ] = callback_sine (frames)
        global lphase = 0.0;
        global rphase = 0.0;
        incl = 440.0 / 44100.0;
        incr = 443.0 / 44100.0;
        nl = incl * frames;
        nr = incr * frames;
        left = sin (2.0 * pi * [lphase:incl:lphase+nl]);
        right = sin (2.0 * pi * [rphase:incr:rphase+nr]);
        sound = [left', right'];
        status = 0;
        lphase = lphase + nl;
        rphase = rphase + nr;
      endfunction
An example of its use is given below.
      player = audioplayer(@callback_sine);
      # play as long as you want
It is also possible to use Octave functions to process recorded data. An example that writes recorded data into a text file is given below.

      function status = callback_record (sound)
        fid = fopen('record.txt', 'at');
        for index = 1:rows(sound)
          fprintf(fid, "%.4f, %.4f\n", sound(index, 1), sound(index, 2));
        endfor
        fclose(fid);
        status = 0;
      endfunction
      Here is a plot of the resulting file with me clapping three times. Only one channel is plotted.

by Vytautas Jančauskas at August 17, 2013 03:48 PM

      August 15, 2013


      The single Type Support

Octave uses the double data type by default, but it also includes support for single precision. The single-precision data type is now implemented in the JIT compiler as well:

      -------------------- M-file --------------------
      a = single(1);
      b = single(1);

      init = single(0);
      bounds = single(10000);

      i = init;
      do
        j = init;
        do
          a = a + b;
          j++;
        until (j == bounds)
        i++;
      until (i == bounds)
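The M-file above amounts to two nested do-until loops: a gains b once per inner iteration, bounds² times in total, i.e. 10^8 floating-point adds for bounds = 10000 — which is exactly where the interpreter's seven minutes go, and why a JIT that compiles the inner loop down to a couple of float instructions finishes in a third of a second. A Python sketch of the same computation, with a reduced bound so it runs instantly:

```python
def nested_do_until(bounds):
    """Python rendering of the nested do-until benchmark (reduced bound)."""
    a, b = 1.0, 1.0
    i = 0.0
    while True:                 # outer do-until: body runs before the test
        j = 0.0
        while True:             # inner do-until
            a = a + b
            j += 1.0
            if j == bounds:
                break
        i += 1.0
        if i == bounds:
            break
    return a

result = nested_do_until(100.0)  # a starts at 1 and gains b, bounds**2 times
```

The float comparisons `j == bounds` are exact here because both values are integers represented exactly in floating point, matching the original code's `until (j == bounds)` tests.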

The execution time with the Octave interpreter alone is 6m52.943s. With JIT enabled, it drops to 0.366s. Let's take a look at the corresponding JIT linear IR and LLVM IR:

      -------------------- Compiling tree --------------------
        j = init;
          a = a + b;
        until(j == bounds)
      until(i == bounds)

      -------------------- octave jit ir --------------------
      body0:        %pred =
              single: i#30 = extract i
              single: bounds#23 = extract bounds
              any: ans#19 = extract ans
              single: b#8 = extract b
              single: a#6 = extract a
              any: j#3 = extract j
              single: init#1 = extract init
              branch: [live] do_until_body2
      do_until_body2:        %pred = body0, phi_split1
              single: init#71 phi | body0 -> single: init#1
                                  | phi_split1 -> single: init#70
              single: i#67 phi | body0 -> single: i#30
                               | phi_split1 -> single: i#36
              single: bounds#64 phi | body0 -> single: bounds#23
                                    | phi_split1 -> single: bounds#66
              single: b#62 phi | body0 -> single: b#8
                               | phi_split1 -> single: b#61
              any: ans#59 phi | body0 -> any: ans#19
                              | phi_split1 -> any: ans#87
              single: a#56 phi | body0 -> single: a#6
                               | phi_split1 -> single: a#12
              single: #2 = call  (single: init#71)
              single: j#4 = single: #2
              branch: [live] do_until_body3
      do_until_body3:        %pred = do_until_body2, do_until_cond_check9
              single: j#73 phi | do_until_body2 -> single: j#4
                               | do_until_cond_check9 -> single: j#18
              single: init#70 phi | do_until_body2 -> single: init#71
                                  | do_until_cond_check9 -> single: init#70
              single: i#69 phi | do_until_body2 -> single: i#67
                               | do_until_cond_check9 -> single: i#69
              single: bounds#66 phi | do_until_body2 -> single: bounds#64
                                    | do_until_cond_check9 -> single: bounds#66
              single: b#61 phi | do_until_body2 -> single: b#62
                               | do_until_cond_check9 -> single: b#61
              single: a#55 phi | do_until_body2 -> single: a#56
                               | do_until_cond_check9 -> single: a#12
              single: #7 = call  (single: a#55)
              single: #9 = call  (single: b#61)
              single: #10 = call binary+ (single: #7, single: #9)
              branch: [live] do_until_body4
      do_until_body4:        %pred = do_until_body3
              single: a#12 = single: #10
              single: #13 = call  (single: j#73)
              single: #14 = call unary++ (single: #13)
              branch: [live] do_until_body5
      do_until_body5:        %pred = do_until_body4
              single: #16 = call  (single: #13)
              branch: [live] do_until_body6
      do_until_body6:        %pred = do_until_body5
              single: j#18 = single: #14
              single: ans#20 = single: #16
              branch: [live] do_until_cond_check7
      do_until_cond_check7:        %pred = do_until_body6
              single: #22 = call  (single: j#18)
              single: #24 = call  (single: bounds#66)
              bool: #25 = call binary== (single: #22, single: #24)
              branch: [live] do_until_cond_check8
      do_until_cond_check8:        %pred = do_until_cond_check7
              bool: #27 = call logically_true (bool: #25)
              branch: [live] do_until_cond_check9
      do_until_cond_check9:        %pred = do_until_cond_check8
              cond_branch: bool: #27, [live] do_until_tail10, [live] do_until_body3
      do_until_tail10:        %pred = do_until_cond_check9
              single: #31 = call  (single: i#69)
              single: #32 = call unary++ (single: #31)
              branch: [live] do_until_tail11
      do_until_tail11:        %pred = do_until_tail10
              single: #34 = call  (single: #31)
              branch: [live] do_until_tail12
      do_until_tail12:        %pred = do_until_tail11
              single: i#36 = single: #32
              single: ans#37 = single: #34
              branch: [live] do_until_cond_check13
      do_until_cond_check13:        %pred = do_until_tail12
              single: #39 = call  (single: i#36)
              single: #40 = call  (single: bounds#66)
              bool: #41 = call binary== (single: #39, single: #40)
              branch: [live] do_until_cond_check14
      do_until_cond_check14:        %pred = do_until_cond_check13
              bool: #43 = call logically_true (bool: #41)
              branch: [live] do_until_cond_check15
      do_until_cond_check15:        %pred = do_until_cond_check14
              cond_branch: bool: #43, [live] do_until_tail16, [live] phi_split1
      phi_split1:        %pred = do_until_cond_check15
              any: #86 = call (any) (single: ans#37)
              any: ans#87 = any: #86
              branch: [live] do_until_body2
      do_until_tail16:        %pred = do_until_cond_check15
              branch: [live] final17
      final17:        %pred = do_until_tail16
              single: j#75 phi | do_until_tail16 -> single: j#18
              single: i#68 phi | do_until_tail16 -> single: i#36
              single: ans#60 phi | do_until_tail16 -> single: ans#37
              store a = single: a#12
              store ans = single: ans#60
              store b = single: b#61
              store bounds = single: bounds#66
              store i = single: i#68
              store init = single: init#70
              store j = single: j#75

      -------------------- llvm ir --------------------
      define void @foobar(%octave_base_value**) {
        %1 = getelementptr inbounds %octave_base_value** %0, i32 0
        %2 = getelementptr inbounds %octave_base_value** %0, i32 1
        %3 = getelementptr inbounds %octave_base_value** %0, i32 2
        %4 = getelementptr inbounds %octave_base_value** %0, i32 3
        %5 = getelementptr inbounds %octave_base_value** %0, i32 4
        %6 = getelementptr inbounds %octave_base_value** %0, i32 5
        %7 = getelementptr inbounds %octave_base_value** %0, i32 6
        br label %body

      body:                                             ; preds = %prelude
        %8 = load %octave_base_value** %1
        %9 = call float @octave_jit_cast_single_any(%octave_base_value* %8)
        %10 = load %octave_base_value** %2
        %11 = call float @octave_jit_cast_single_any(%octave_base_value* %10)
        %12 = load %octave_base_value** %3
        %13 = call %octave_base_value* @id_any(%octave_base_value* %12)
        %14 = load %octave_base_value** %4
        %15 = call float @octave_jit_cast_single_any(%octave_base_value* %14)
        %16 = load %octave_base_value** %5
        %17 = call float @octave_jit_cast_single_any(%octave_base_value* %16)
        %18 = load %octave_base_value** %6
        %19 = call %octave_base_value* @id_any(%octave_base_value* %18)
        %20 = load %octave_base_value** %7
        %21 = call float @octave_jit_cast_single_any(%octave_base_value* %20)
        br label %do_until_body

      do_until_body:                                    ; preds = %phi_split, %body
        %22 = phi float [ %21, %body ], [ %30, %phi_split ]
        %23 = phi float [ %9, %body ], [ %46, %phi_split ]
        %24 = phi float [ %11, %body ], [ %32, %phi_split ]
        %25 = phi float [ %15, %body ], [ %33, %phi_split ]
        %26 = phi %octave_base_value* [ %13, %body ], [ %52, %phi_split ]
        %27 = phi float [ %17, %body ], [ %37, %phi_split ]
        %28 = call float @id_single(float %22)
        br label %do_until_body1

      do_until_body1:                                   ; preds = %do_until_cond_check6, %do_until_body
        %29 = phi float [ %28, %do_until_body ], [ %39, %do_until_cond_check6 ]
        %30 = phi float [ %22, %do_until_body ], [ %30, %do_until_cond_check6 ]
        %31 = phi float [ %23, %do_until_body ], [ %31, %do_until_cond_check6 ]
        %32 = phi float [ %24, %do_until_body ], [ %32, %do_until_cond_check6 ]
        %33 = phi float [ %25, %do_until_body ], [ %33, %do_until_cond_check6 ]
        %34 = phi float [ %27, %do_until_body ], [ %37, %do_until_cond_check6 ]
        %35 = call float @id_single(float %34)
        %36 = call float @id_single(float %33)
        %37 = call float @"octave_jit_+_single"(float %35, float %36)
        br label %do_until_body2

      do_until_body2:                                   ; preds = %do_until_body1
        %38 = call float @id_single(float %29)
        %39 = call float @"octave_jit_++2"(float %38)
        br label %do_until_body3

      do_until_body3:                                   ; preds = %do_until_body2
        %40 = call float @id_single(float %38)
        br label %do_until_body4

      do_until_body4:                                   ; preds = %do_until_body3
        br label %do_until_cond_check

      do_until_cond_check:                              ; preds = %do_until_body4
        %41 = call float @id_single(float %39)
        %42 = call float @id_single(float %32)
        %43 = call i1 @"octave_jit==_single"(float %41, float %42)
        br label %do_until_cond_check5

      do_until_cond_check5:                             ; preds = %do_until_cond_check
        %44 = call i1 @id_bool(i1 %43)
        br label %do_until_cond_check6

      do_until_cond_check6:                             ; preds = %do_until_cond_check5
        br i1 %44, label %do_until_tail, label %do_until_body1

      do_until_tail:                                    ; preds = %do_until_cond_check6
        %45 = call float @id_single(float %31)
        %46 = call float @"octave_jit_++2"(float %45)
        br label %do_until_tail7

      do_until_tail7:                                   ; preds = %do_until_tail
        %47 = call float @id_single(float %45)
        br label %do_until_tail8

      do_until_tail8:                                   ; preds = %do_until_tail7
        br label %do_until_cond_check9

      do_until_cond_check9:                             ; preds = %do_until_tail8
        %48 = call float @id_single(float %46)
        %49 = call float @id_single(float %32)
        %50 = call i1 @"octave_jit==_single"(float %48, float %49)
        br label %do_until_cond_check10

      do_until_cond_check10:                            ; preds = %do_until_cond_check9
        %51 = call i1 @id_bool(i1 %50)
        br label %do_until_cond_check11

      do_until_cond_check11:                            ; preds = %do_until_cond_check10
        br i1 %51, label %do_until_tail12, label %phi_split

      phi_split:                                        ; preds = %do_until_cond_check11
        %52 = call %octave_base_value* @octave_jit_cast_any_single(float %47)
        br label %do_until_body

      do_until_tail12:                                  ; preds = %do_until_cond_check11
        br label %final

      final:                                            ; preds = %do_until_tail12
        %53 = phi float [ %39, %do_until_tail12 ]
        %54 = phi float [ %46, %do_until_tail12 ]
        %55 = phi float [ %47, %do_until_tail12 ]
        %56 = call %octave_base_value* @octave_jit_cast_any_single(float %37)
        store %octave_base_value* %56, %octave_base_value** %5
        %57 = call %octave_base_value* @octave_jit_cast_any_single(float %55)
        store %octave_base_value* %57, %octave_base_value** %3
        %58 = call %octave_base_value* @octave_jit_cast_any_single(float %33)
        store %octave_base_value* %58, %octave_base_value** %4
        %59 = call %octave_base_value* @octave_jit_cast_any_single(float %32)
        store %octave_base_value* %59, %octave_base_value** %2
        %60 = call %octave_base_value* @octave_jit_cast_any_single(float %54)
        store %octave_base_value* %60, %octave_base_value** %1
        %61 = call %octave_base_value* @octave_jit_cast_any_single(float %30)
        store %octave_base_value* %61, %octave_base_value** %7
        %62 = call %octave_base_value* @octave_jit_cast_any_single(float %53)
        store %octave_base_value* %62, %octave_base_value** %6
        ret void

      -------------------- optimized llvm ir --------------------
      define void @foobar(%octave_base_value**) {
        %1 = getelementptr inbounds %octave_base_value** %0, i64 1
        %2 = getelementptr inbounds %octave_base_value** %0, i64 2
        %3 = getelementptr inbounds %octave_base_value** %0, i64 3
        %4 = getelementptr inbounds %octave_base_value** %0, i64 4
        %5 = getelementptr inbounds %octave_base_value** %0, i64 5
        %6 = getelementptr inbounds %octave_base_value** %0, i64 6
        %7 = load %octave_base_value** %0, align 8
        %8 = call float @octave_jit_cast_single_any(%octave_base_value* %7)
        %9 = load %octave_base_value** %1, align 8
        %10 = call float @octave_jit_cast_single_any(%octave_base_value* %9)
        %11 = load %octave_base_value** %3, align 8
        %12 = call float @octave_jit_cast_single_any(%octave_base_value* %11)
        %13 = load %octave_base_value** %4, align 8
        %14 = call float @octave_jit_cast_single_any(%octave_base_value* %13)
        %15 = load %octave_base_value** %6, align 8
        %16 = call float @octave_jit_cast_single_any(%octave_base_value* %15)
        br label %do_until_body1

      do_until_body1:                                   ; preds = %prelude, %phi_split, %do_until_body1
        %17 = phi float [ %21, %do_until_body1 ], [ %16, %prelude ], [ %16, %phi_split ]
        %18 = phi float [ %18, %do_until_body1 ], [ %8, %prelude ], [ %23, %phi_split ]
        %19 = phi float [ %20, %do_until_body1 ], [ %14, %prelude ], [ %20, %phi_split ]
        %20 = fadd float %12, %19
        %21 = fadd float 1.000000e+00, %17
        %22 = fcmp ueq float %21, %10
        br i1 %22, label %do_until_tail, label %do_until_body1

      do_until_tail:                                    ; preds = %do_until_body1
        %23 = fadd float 1.000000e+00, %18
        %24 = fcmp ueq float %23, %10
        br i1 %24, label %final, label %phi_split

      phi_split:                                        ; preds = %do_until_tail
        %25 = call %octave_base_value* @octave_jit_cast_any_single(float %18)
        br label %do_until_body1

      final:                                            ; preds = %do_until_tail
        %26 = call %octave_base_value* @octave_jit_cast_any_single(float %20)
        store %octave_base_value* %26, %octave_base_value** %4, align 8
        %27 = call %octave_base_value* @octave_jit_cast_any_single(float %18)
        store %octave_base_value* %27, %octave_base_value** %2, align 8
        %28 = call %octave_base_value* @octave_jit_cast_any_single(float %12)
        store %octave_base_value* %28, %octave_base_value** %3, align 8
        %29 = call %octave_base_value* @octave_jit_cast_any_single(float %10)
        store %octave_base_value* %29, %octave_base_value** %1, align 8
        %30 = call %octave_base_value* @octave_jit_cast_any_single(float %23)
        store %octave_base_value* %30, %octave_base_value** %0, align 8
        %31 = call %octave_base_value* @octave_jit_cast_any_single(float %16)
        store %octave_base_value* %31, %octave_base_value** %6, align 8
        %32 = call %octave_base_value* @octave_jit_cast_any_single(float %21)
        store %octave_base_value* %32, %octave_base_value** %5, align 8
        ret void

by L Y.H. at August 15, 2013 12:11 PM

      August 14, 2013

      Marco Vassallo

      update # 13


During the last weeks, following the time-line presented in the midterm post, I have worked on some new examples and made some improvements to the code.
In particular, the major changes I have made to the code are:
• A new naming convention, which hopefully makes the user interface closer to the one available within Dolfin (though I still need to rename some functions)
• A new way of assembling the matrix. It seems that there is no way to get the number of nonzero entries of the matrix, as you can read here. Thus, instead of using a for loop over all the elements of the matrix, I now use the getrow() command from Dolfin. You can have a quick look at the code here (any suggestion is welcome).
• The possibility to build an Expression using a multidimensional function handle.
• The possibility to use the function Function() to extract a scalar field from a vector one.
      •  We have split the construction of the form into two steps:
        1. We set all the coefficients of the form using the function which we create on the fly. They will be named ProblemName_BilinearForm or ProblemName_LinearForm.
        2. Then we apply specific BC to the form using the assemble() function and we get back the matrix. If we are assembling the whole system and we want to keep the symmetry of the matrix (if any), we can instead use the command assemble_system (). Finally, if we are solving a non-linear problem and we need to apply essential BC, we should provide to the function also the vector with the tentative solution in order to modify the entries corresponding to the boundary values. This will be illustrated below in the HyperElasticity example.
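The row-wise assembly described in the second bullet can be sketched in plain Octave as follows. Here example_getrow is a hypothetical stand-in for dolfin's getrow() (the real code queries the wrapped dolfin matrix, not a toy function); the point is that a single sparse () call replaces a loop over every (i, j) entry of the matrix.

```octave
1;  # script-file marker so we can define a helper function below

# Hypothetical stand-in for dolfin's getrow (): returns the nonzero
# column indices and values of row r.  Here it just produces a
# tridiagonal (1D Laplacian) pattern for illustration.
function [cols, vals] = example_getrow (r, n)
  cols = intersect (r-1:r+1, 1:n);
  vals = 2 * (cols == r) - 1 * (cols != r);
endfunction

n = 6;
ii = jj = vv = [];
for r = 1:n
  [cols, vals] = example_getrow (r, n);
  ii = [ii, repmat(r, 1, numel (cols))];
  jj = [jj, cols];
  vv = [vv, vals];
endfor
# One sparse () call instead of a "for" cycle over all matrix entries
A = sparse (ii, jj, vv, n, n);
```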
With the following examples, we can see the new features directly in action and understand how they work.
• Navier-Stokes: we learn how to deal with a vector-field problem and how to save the solution using the fem_save () function. We also use the msh pkg to generate a mesh with gmsh.
• Mixed-Poisson: we solve the Poisson problem presented in the previous posts using a mixed formulation, and we see how to extract a scalar field from a vector one.
• HyperElasticity: we exploit the fsolve () command to solve a non-linear problem. In particular, we see how to use the assemble () function to apply BCs in this situation as well.
• Advection-Diffusion: we solve a time dependent problem using the lsode () command and save the solution using the fpl pkg.
For each problem, we refer the reader to the complete description on the FEniCS or bim web page, while here we highlight only the implementation details relevant for our pkg.


Navier-Stokes

The linear problem is described here, while the source files are available there. Below is a video representing the solution.

UFL code

      # Copyright (C) 2010 Anders Logg
      # Tentative Velocity 

      # Define function spaces (P2-P1)
      V = VectorElement("CG", triangle, 2)
      Q = FiniteElement("CG", triangle, 1)

      # Define trial and test functions
      u = TrialFunction(V)
      v = TestFunction(V)

      # Define coefficients
      k  = Constant(triangle)
      u0 = Coefficient(V)
      f  = Coefficient(V)
      nu = 0.01

      # Define bilinear and linear forms
      eq = (1/k)*inner(u - u0, v)*dx + inner(grad(u0)*u0, v)*dx + \
          nu*inner(grad(u), grad(v))*dx - inner(f, v)*dx
      a  = lhs(eq)
      L  = rhs(eq)

      # Pressure update 

      # Define function spaces (P2-P1)
      V = VectorElement("CG", triangle, 2)
      Q = FiniteElement("CG", triangle, 1)

      # Define trial and test functions
      p = TrialFunction(Q)
      q = TestFunction(Q)

      # Define coefficients
      k  = Constant(triangle)
      u1 = Coefficient(V)

      # Define bilinear and linear forms
      a = inner(grad(p), grad(q))*dx
      L = -(1/k)*div(u1)*q*dx

      # Velocity update 

      # Define function spaces (P2-P1)
      V = VectorElement("CG", triangle, 2)
      Q = FiniteElement("CG", triangle, 1)

      # Define trial and test functions
      u = TrialFunction(V)
      v = TestFunction(V)

      # Define coefficients
      k  = Constant(triangle)
      u1 = Coefficient(V)
      p1 = Coefficient(Q)

      # Define bilinear and linear forms
      a = inner(u, v)*dx
      L = inner(u1, v)*dx - k*inner(grad(p1), v)*dx

      Code for the domain using the msh pkg

      name = [tmpnam ".geo"];
      fid = fopen (name, "w");
      fputs (fid,"Point (1)  = {0, 0, 0, 0.1};\n");
      fputs (fid,"Point (2)  = {1, 0, 0, 0.1};\n");
      fputs (fid,"Point (3)  = {1, 0.5, 0, 0.1};\n");
      fputs (fid,"Point (4)  = {0.5, 0.5, 0, 0.1};\n");
      fputs (fid,"Point (5) = {0.5, 1, 0, 0.1};\n");
      fputs (fid,"Point (6) = {0, 1, 0,0.1};\n");

      fputs (fid,"Line (1)  = {5, 6};\n");
      fputs (fid,"Line (2) = {2, 3};\n");

      fputs (fid,"Line(3) = {6,1,2};\n");
      fputs (fid,"Line(4) = {5,4,3};\n");
      fputs (fid,"Line Loop(7) = {3,2,-4,1};\n");
      fputs (fid,"Plane Surface(8) = {7};\n");

      fclose (fid);
      msho = msh2m_gmsh (canonicalize_file_name (name)(1:end-4),"scale", 1,"clscale", .2);
      trimesh (msho.t(1:3,:)', msho.p(1,:)', msho.p(2,:)');
unlink (canonicalize_file_name (name));

      Octave script

      pkg load msh
      pkg load fem-fenics

      fem_init_env ();

      fem_create_all ("TentativeVelocity");
      fem_create_all ("VelocityUpdate");
      fem_create_all ("PressureUpdate");

# We use the msh pkg and the script above to generate the L-shaped domain
      run lshape-domain.m;
      mshd = Mesh (msho);

      V = VelocityUpdate_FunctionSpace (mshd);
      Q = PressureUpdate_FunctionSpace (mshd);

      # no slip BC inside
      bcnoslip = DirichletBC (V, @(x,y) [0;0], [3,4]);
      # constant pressure at the outflow
      bcout = DirichletBC (Q, @(x,y) 0, 2);

      t = 0;
      dt = 0.01;
      T = 3.;

      k = Expression ('k', @(x,y) dt);
      f = Expression ('f', @(x,y) [0; 0]);
      u0 = Expression ('u0', @(x,y) [0; 0]);
      p0 = Expression ('p1', @(x,y) 0);

      a1 = TentativeVelocity_BilinearForm (V, k);
      A1 = assemble (a1, bcnoslip);
      a3 = VelocityUpdate_BilinearForm (V);
      A3 = assemble (a3, bcnoslip);

      # solve the problem for each time step
      # We use the Chorin-Temam algorithm

      i = 0;
      while t < T
        t += dt;
        # time dependent pressure at the inflow
        bcin = DirichletBC (Q, @(x,y) sin(3.0*t), 1);

        L1 = TentativeVelocity_LinearForm (V, k, u0, f);
        b1 = assemble (L1, bcnoslip);
        utmp = A1 \ b1;

        u1 = Function ('u1', V, utmp);

        # we need to assemble the system at each time-step because of time dependent BC
        a2 = PressureUpdate_BilinearForm (Q);
        L2 = PressureUpdate_LinearForm (Q,u1, k);
        [A2, b2] = assemble_system (a2, L2, bcin, bcout);

        ptmp = A2 \ b2;
        p1 = Function ('p1', Q, ptmp);

        L3 = VelocityUpdate_LinearForm (V, k, u1, p1);
        b3 = assemble (L3, bcnoslip);

        ut = A3 \ b3;
        u0 = Function ('u0', V, ut);

        # We use the fem_save () function to store the solution in vtu format
        name = sprintf ("u_%3.3d", ++i);
        fem_save (u0, name);
  delete ([name ".pvd"]);
endwhile



Mixed Poisson

The linear problem is described here, while the source files are available there.

UFL code

      # Copyright (C) 2006-2010 Anders Logg and Marie E. Rognes
      BDM = FiniteElement("BDM", triangle, 1)
      DG  = FiniteElement("DG", triangle, 0)
      W = BDM * DG

      (sigma, u) = TrialFunctions(W)
      (tau, v)   = TestFunctions(W)

      CG = FiniteElement("CG", triangle, 1)
      f = Coefficient(CG)

      a = (dot(sigma, tau) + div(tau)*u + div(sigma)*v)*dx
      L = - f*v*dx 

      Octave script

      pkg load msh
      pkg load fem-fenics
      fem_init_env ();

      problem = "MixedPoisson";
      fem_create_all (problem);
      x = y = linspace (0, 1, 32);
      msho = msh2m_structured_mesh (x, y, 1, 1:4);
      mshd = Mesh (msho);

      V = MixedPoisson_FunctionSpace (mshd);

      bc1 = DirichletBC (V, @(x,y) [0; -sin(5.0*x); 0], 1);
      bc2 = DirichletBC (V, @(x,y) [0;  sin(5.0*x); 0], 3);

      f = Expression ('f', @(x,y) 10*exp(-((x - 0.5)^2 + (y - 0.5)^2) / 0.02));

      a = MixedPoisson_BilinearForm (V);
      L = MixedPoisson_LinearForm (V, f);

      [A, b] = assemble_system (a, L, bc1, bc2);

      u = A \ b;

# We use Function () in two different ways:
# if the second argument is a FunctionSpace, we build an object from the vector given as the third argument;
# if the second argument is a Function, we extract the component of the vector field specified by the third argument.
      func = Function ('u', V, u);
      uu = Function ('u', func, 0);
      sigma = Function ('sigma', func, 1);

      fem_plot (sigma);
      fem_plot (uu);


      Hyper Elasticity

      The non-linear problem is described here, while the source files are available there.

UFL code

      # Copyright (C) 2009-2010 Harish Narayanan and Garth N. Wells
      # Function spaces
      element = VectorElement("Lagrange", tetrahedron, 1)

      # Trial and test functions
      du = TrialFunction(element)     # Incremental displacement
      v  = TestFunction(element)      # Test function

      # Functions
      u = Coefficient(element)        # Displacement from previous iteration
      B = Coefficient(element)        # Body force per unit volume
      T = Coefficient(element)        # Traction force on the boundary

      # Kinematics
      I = Identity(element.cell().d)  # Identity tensor
      F = I + grad(u)                 # Deformation gradient
      C = F.T*F                       # Right Cauchy-Green tensor

      # Invariants of deformation tensors
      Ic = tr(C)
      J  = det(F)

      # Elasticity parameters
      mu    = Constant(tetrahedron)
      lmbda = Constant(tetrahedron)

      # Stored strain energy density (compressible neo-Hookean model)
      psi = (mu/2)*(Ic - 3) - mu*ln(J) + (lmbda/2)*(ln(J))**2

      # Total potential energy
      Pi = psi*dx - inner(B, u)*dx - inner(T, u)*ds

      # First variation of Pi (directional derivative about u in the direction of v)
      F = derivative(Pi, u, v)

      # Compute Jacobian of F
      J = derivative(F, u, du)

Octave script

      pkg load msh
      pkg load fem-fenics

      fem_init_env ();
      problem = 'HyperElasticity';
      fem_create_all (problem);

      x = y = z = linspace (0, 1, 16);
      msho = msh3m_structured_mesh (x, y, z, 1, 1:6);
      mshd = Mesh (msho);

      # As a general rule, the operator :: of C++ is replaced with _ .
      # Thus, instead of HyperElasticity::FunctionSpace, we have the 
      # function HyperElasticity_FunctionSpace.
      V = HyperElasticity_FunctionSpace (mshd);

      # To enforce essential BCs on a multi-dimensional problem
      # we can use a function handle which specifies the value of the vector
      # Be aware that there are no default values for the entries, and thus
      # DirichletBC (V, @(x,y,z) [0; 0], 1) is different from
      # DirichletBC (V, @(x,y,z) [0; 0; 0], 1);
      # In this example, the surface "1" is the left side, while "2" is the right side of the cube.

      bc1 = DirichletBC (V, @(x,y,z) [0; 0; 0], 1);
      bc2 = DirichletBC (V, @(x,y,z) [0; ...
         0.5 * (0.5 + (y - 0.5) * cos (pi / 3) - (z - 0.5) * sin (pi/3) - y); ...
         0.5 * (0.5 + (y - 0.5) * sin (pi / 3) + (z - 0.5) * cos(pi/3) - z)], 2);

# We can now specify the values for the coefficients declared in the .ufl file
      B = Expression ('B', @(x,y,z) [0.0; -0.5; 0.0]);
      T = Expression ('T', @(x,y,z) [0.1; 0.0; 0.0]);
      E = 10.0;
      nu = 0.3;
      mu = Expression ('mu', @(x,y,z) E./(2*(1+nu)));
      lmbda = Expression ('lmbda', @(x,y,z) E*nu./((1+nu)*(1-2*nu)));
      u = Expression ('u', @(x,y,z) [0; 0; 0]);

      # The residual form F is in fact a LinearForm, and can thus be created using the usual function. 
      # The same holds for the JacobianForm J, which is a bilinear form.
      L = HyperElasticity_LinearForm (V, mu, lmbda, B, T, u);

      # We can now solve the non-linear problem using fsolve(). 

      # Create the initial guess for fsolve
      u0 = assemble (L, bc1, bc2);

      # Specify the function used to compute the residual and the Jacobian
      fs = @(xx) f (xx, V, bc1, bc2, B, T, mu, lmbda);

      # Solve the problem
      [xx, fval, info] = fsolve (fs, u0, optimset ("jacobian", "on"));

      func = Function ('u', V, xx);
      fem_plot (func);

      Function for the residual

      function [y, jac] = f (xx, V, bc1, bc2, B, T, mu, lmbda)

         u = Function ('u', V, xx);

         a = HyperElasticity_BilinearForm (V, mu, lmbda, u);
         L = HyperElasticity_LinearForm (V, mu, lmbda, B, T, u);

# Here we have to apply the BCs carefully.
# If the assemble function is called with two output arguments, then a vector containing the current solution of the problem is also expected as an input, right after the form.
# If instead we call assemble_system with three output arguments, the vector has to be passed right after the two forms.
   if (nargout == 1)
     [y, xx] = assemble (L, xx, bc1, bc2);
   elseif (nargout == 2)
     [jac, y, xx] = assemble_system (a, L, xx, bc1, bc2);
   endif

endfunction




Advection-Diffusion

The problem is described here, while the source files are available there.

UFL code

      # Advection-Diffusion
      element = FiniteElement("Lagrange", tetrahedron, 1)

      u = TrialFunction(element)
      v = TestFunction(element)

      mu = Coefficient(element)
      phi = Coefficient(element)

      a = mu*inner(grad(u), grad(v))*dx - u * inner (grad(phi), grad (v))*dx

      # Mass matrix

      element = FiniteElement("Lagrange", tetrahedron, 1)

      u = TrialFunction(element)
      v = TestFunction(element)

      a = u*v*dx

Octave script

      pkg load msh
      pkg load fpl
      pkg load fem-fenics

      fem_init_env ();

      # As we need only the rhs we create only this one using the specific function
      problem = 'Advection_Diffusion';
      fem_create_rhs (problem);

      problem = 'Reaction';
      fem_create_rhs (problem);
      fem_create_fs (problem);

      x = linspace (0, 1, 40);
      y = z = linspace (0, 1, 20);
      msho = msh3m_structured_mesh (x, y, z, 1, 1:6);
      mshd = Mesh (msho);

      x = msho.p(1, :).';
      y = msho.p(2, :).';
      z = msho.p(3, :).';

      x0 = .2; sx = .1;
      y0 = .2; sy = .1;
      z0 = .8; sz = .1;

      u = exp (- ((x-x0)/(2*sx)) .^2 - ((y-y0)/(2*sy)) .^2 - ((z-z0)/(2*sz)) .^2);

      V  = Reaction_FunctionSpace (mshd);

      f = Expression ('phi', @(x,y,z) x + y -z);
      g = Expression ('mu', @(x,y,z) 0.01);

      a = Advection_Diffusion_BilinearForm (V, f, g);
      m = Reaction_BilinearForm (V);
      A = assemble (a);
      M = assemble (m);

# Lumped reaction matrix; by lumping the matrix we stabilize the algorithm and make the solution of the system easier

      ML = diag (sum (M, 1));
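As a toy illustration of why lumping helps (the numbers below are invented for the example, not taken from the problem above): each column sum goes on the diagonal, so the total "mass" is preserved while M \ b reduces to an element-wise division by the diagonal.

```octave
M  = [2, 1; 1, 2];          # small consistent "mass matrix"
ML = diag (sum (M, 1));     # lumped version: diag ([3, 3])
b  = [3; 6];
x  = ML \ b;                # same as b ./ diag (ML), here [1; 2]
```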

function du = f (u, t, A, M)
  du = - M \ (A * u);
endfunction

# We use the lsode command with the adams integration method
      time = linspace (0, 1, 30);
      lsode_options ("integration method", "adams");
      U = lsode (@(u, t) f(u, t, A, ML), u, time);

# Instead of using the fem_save () command as previously done, we can use the fpl pkg
      for ii = 1:1:numel (time)
        name = sprintf ("u_%3.3d", ii);
        delete ([name ".vtu"]);
  fpl_vtk_write_field (name, msho, {U(ii,:)', 'u'}, {}, 1);
endfor

by Gedeone at August 14, 2013 05:51 AM

      August 06, 2013

      Andrej Lojdl

      New timeline - part 2

The evaluation period is over and the second part of Google Summer of Code has started, so here is the second part of the timeline for my project.

      Week 1  -  [ 05.08 ÷ 09.08] - Upgrade on-screen rendering

The proof of concept has some flaws that need to be fixed, and the LaTeX interpreter generally needs upgrading. The first thing is to change the image format from PPM to a format that lets Ghostscript store transparency information. The focus is on the BMP format: it is among GS's output devices and is uncompressed, so it is relatively easy to extract raw data from it. The next step is to shrink the image and improve image quality; this will be done with ImageMagick and GS. There are several additional nice-to-have upgrades that are not strictly necessary, such as piping dvips and GS together, or changing the font and font size. These will be worked on as time permits.

      Week 2  -  [ 12.08 ÷ 16.08 ] - Figure out how the printing is working

  By this week there should be a functional and fairly good LaTeX interpreter for on-screen rendering. The plan is to bring this interpreter to the printing side. The first step is discovering how exactly the printing side works and planning how to implement the LaTeX interpreter there.

      Week 3/4  -  [ 19.08 ÷ 30.08 ] - Implement printing side

  These two weeks should be used to implement and fully test the LaTeX interpreter on the printing side. For printing, the EPS (vector) image format will be used, since the printing side is responsible for saving plots to files. This will let users save plots with fancy LaTeX markup and present their results to other people.

Week 5  -  [ 02.09 ÷ 06.09 ] - Configuring interpreter functionality

  It would be nice to check whether the user has all the binaries required to run the LaTeX interpreter. If not, the interpreter should be disabled and a proper message printed to the terminal, probably explaining how to solve this and exactly what needs to be installed for it to work properly.

Week 6  -  [ 09.09 ÷ 13.09 ] - Expand to all platforms

  All development is done on Ubuntu 12.10. The last week of the coding period is the right time to check how this new functionality works on other systems, and to modify the code afterwards, since the background programs enabling the interpreter have different names and behave differently on different systems.

by AndrejLojdl at August 06, 2013 08:10 AM

      August 05, 2013

      Vytautas Jančauskas

      Writing Portaudio Callbacks in Octave

It is now possible to have Octave functions supply audio data to be played by audioplayer objects. Stereo and mono playback are supported; the function must supply the specified number of audio frames and return a valid PortAudio status value (0 to continue, 1 for completed and 2 for abort). For example, put the following in a file called callback_noise.m:

function [ left, right, status ] = callback_noise (frames)
  left = randn (1, frames) - 0.5;
  right = randn (1, frames) - 0.5;
  status = 0;   # 0 tells PortAudio to continue playback
endfunction

      You can now use this as a callback as shown below.

      player = audioplayer ('callback_noise', 44100);
      play (player);
      # play for as much you like
      stop (player);

This will play white noise on both channels, which is not really interesting. To make matters more interesting, we can use global variables to store state between callback calls. The following will play sine waves of different frequencies in the two channels; place it in a file called callback_sine.m:

function [ left, right, status ] = callback_sine (frames)
  global lphase = 0.0;
  global rphase = 0.0;
  incl = 440.0 / 44100.0;
  incr = 443.0 / 44100.0;
  nl = incl * frames;
  nr = incr * frames;
  # generate exactly "frames" samples per channel
  left = sin (2.0 * pi * (lphase + (0:frames-1) * incl));
  right = sin (2.0 * pi * (rphase + (0:frames-1) * incr));
  status = 0;
  lphase = lphase + nl;
  rphase = rphase + nr;
endfunction

Both of these examples will play for as long as you don't call the stop method on the player object you created. This shows an advantage of this mode of operation over storing all the data in memory at once: only a small portion of audio has to be held at any given time.

During audioplayer object construction, all the parameters work the same way as when passing audio data: you can supply the bit depth and device ID in addition to the sampling frequency if you want to.
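For instance, a hedged sketch (the positional order of the extra arguments here mirrors the data-based audioplayer call and is my assumption, not a documented signature):

```octave
# Assumed: callback name, sampling rate, then bit depth, as in the
# data-based constructor.
player = audioplayer ('callback_sine', 44100, 16);
play (player);
pause (2);       # let it play for a couple of seconds
stop (player);
```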

by Vytautas Jančauskas at August 05, 2013 01:05 PM

      August 04, 2013

      Kai Torben Ohlhus

      Mercurial commit message cleaning

Now that the midterm is over, it is time to think about making my code ready to be pushed to the official Savannah repository of GNU Octave. On the maintainers mailing list there was a discussion about how to make my changesets fit for the official repository.

Again, thanks to Jordi, I learned a lot about how to "manipulate" Mercurial history. As a first step one should get Mercurial 2.7, because older versions might not support all the features used in this post; the current version from the Debian repository does not suffice.

      A good point to start is getting a list of all my changesets:

        hg log --user k.ohlhus

My first change is that I no longer want to merge the Savannah repository into mine. Instead I learned about Mercurial's rebase command, which puts all your non-official changesets on top of the history; this makes it much easier to track what has changed in my repository.

The old state of my repository is explained very well by Scenario B of the rebase manual page. To fix it I had to run a combination of two commands for each of my merges, e.g.:

        hg phase --draft MERGE_CHANGESET --force
        hg rebase --source MERGE_CHANGESET --dest LAST_CHANGESET_FROM_SAVANNAH
From now on, all my changes are on top. The second cleaning task is to tidy up my commit messages, as I did not stick to Octave's commit message guidelines. By typing

        hg histedit -r LAST_BAD_CHANGESET_OF_MINE

I can create an edit plan for my changeset commit messages and carry it out.

An example edit plan:

      mess a623d25a6b42 17121 Startet fixing the transposition problem with all ILU. S
      pick 466fb4342423 17122 Interface change in itsol_util.h and further addition of
      pick 5afd5d5f42d0 17123 Added MATLAB compatibility wrapper for ILU-factorization
      pick 0dbc3caefb6e 17124 Integrating ZITSOL-library to Octave build system.
      pick e0b86073ce45 17125 Extended ILU-Preconditioner for complex input using ZITS
      pick 9436002c0a0b 17126 Updated test cases. Fixing ILUC output.
      pick 9d4e24a01f42 17127 Test cases for complex case increased.
      pick 1b6ac1f0991b 17128 Test cases for complex case in ilut.
      pick 286aa419eb58 17129 Redirected output of ITSOL and ZITSOL from stderr to tem
      pick 975f30ec1c1b 17130 Make most recent development on ILUTP public. Still not
      pick 65c0db01a558 17131 Final tidy up on Matlab wrapper with tests.
      pick 4244faff69d6 17132 Moved ilu.m from scripts/linear-algebra to scripts/spars
      pick 31a15f423dd8 17133 removed and new build entries.

      # Edit history between a623d25a6b42 and 31a15f423dd8
      # Commands:
      #  p, pick = use commit
      #  e, edit = use commit, but stop for amending
      #  f, fold = use commit, but fold into previous commit (combines N and N-1)
      #  d, drop = remove commit from history
      #  m, mess = edit message without changing commit content

This tedious but necessary work has to find its way into my development repository after my journey (13th of August). Then I'll provide instructions on how to get the new changesets and how to remove the old ones. In any case, I'm sorry for the inconvenience.

by Kai Torben Ohlhus at August 04, 2013 05:33 AM

      August 03, 2013

      Anirudha Bose


Having passed the mid-term evaluations of GSoC, it is time to start my work on making Octave available for Mac OS X. I have never used a Mac, so I will use the coming days to study the build system of Mac OS X. Thanks to the Director of our university for giving me access to a Mac Pro in the Mobile Computing Lab. I have full physical access to it until 5 pm; after that, I am permitted to use the machine through TeamViewer/SSH/Telnet (I haven't figured out which yet).

At this moment, creating a new Mac OS X app bundle using MacPorts seems most practical to me. App bundles are directory hierarchies, with the top-level directory having a name that ends in a .app extension. Using MacPorts, we can produce pre-compiled binary packages with standalone installers. The final app bundle doesn't require MacPorts to be installed on the target system: all the dependencies are part of the bundle. A dmg disk image of the final package is also desirable. The .app bundle can also be run directly like any normal app, without installation, just like executing bin/octave.exe in the Windows version of Octave, where all the dependencies are pre-built.

There is an existing app bundle for Octave 3.2 by Thomas Treichl. I am not sure whether working on this is a good idea: it is very old, and while digging through the mailing list I learned that its scripts might not work properly with newer versions of Octave. As of now, using MacPorts seems a good option to me; it also has some great documentation to help me get started. I also know that Ben Abbott used this approach to produce working app bundles for Octave. I will discuss the project with the Octave community and write another blog article about what exactly I am going to do in this second half of GSoC.

      by Anirudha at August 03, 2013 08:50 PM

      Ahsan Ali Shahid

      Agora Update #6 Midterm

During the past week there was little for me to work on: my internet connection was broken, and it will be fixed on Monday. In the meantime, I consulted the offline Django documentation and tried to find a way to automatically add hg repositories on Agora using a script. As of now, due to the broken internet connection, my progress has suffered.

I also filled out the midterm evaluation during this time. For the post-midterm period, my objectives are to add an automatic hg repository creation feature (I need some advice on that from Jordi) and to make Agora totally bug-free (for which I will do, and learn a bit of, sys-adminning). This week the month of Ramadan is also coming to an end (9th of August, probably), and I have the Eid days to spend with my family, so I will take a few days off; afterwards I will focus and spend more time on completing the project.

      by ahsanali at August 03, 2013 08:20 PM

      August 01, 2013

      Anirudha Bose


The trick I was using earlier to get Java support in Octave was merely a workaround. My mentor Michael told me that cross-compiling Octave with Java support did not work out of the box. Including Java support initially required a local Windows installation of the JDK: only the Windows version of the JDK has the file jvm.dll, which was checked for during configure, while the Linux version of Java (OpenJDK) doesn't have it. After some discussion, I learned that jvm.dll is not actually required to build Octave with Java; the only files necessary for Java support are jni.h and the platform-dependent jni_md.h.

To remove the jvm.dll check, I used the following patch:

      diff --git a/src/octave-2-no-jvm-check.patch b/src/octave-2-no-jvm-check.patch
      new file mode 100644
      --- /dev/null
      +++ b/src/octave-2-no-jvm-check.patch
      @@ -0,0 +1,18 @@
      +This file is part of MXE.
      +See index.html for further information.
      +Contains patch to remove checking for jvm.dll during integration of Java support in Octave
      +diff -r bb713af2e1d9
      +--- a/	Tue Jul 30 00:49:37 2013 -0400
      ++++ b/	Thu Aug 01 02:57:25 2013 +0530
      +@@ -2441,9 +2441,6 @@
      +     darwin*)
      +       jvmlib=libjvm.dylib
      +     ;;
      +-    mingw* | cygwin*)
      +-      jvmlib=jvm.dll
      +-    ;;
      +     *)
      +     ;;

      The process of exporting all missing symbols has now been added as a patch. So there is no need to make the change in the original package, get the sha1 checksum and update

Even though the jvm.dll check was removed, the compiler complained about the missing jni_md.h file. OpenJDK has jni.h, but that was not enough to build a Java-enabled Octave. So Michael suggested something much better: it should be possible to download the two header files, put them somewhere in the include directory of mxe-octave (like usr/i686-pc-mingw32/include/java/), and then make Octave use them. I have pushed a changeset in my repo to do this. Now one will be able to build Octave with Java enabled without having to hack the --with-java-homedir, --with-java-includedir, and --with-java-libdir options. The use of separate JNI headers during the build is limited to cross-compilation; for native builds, Octave compiles just as it did before. See the new src/ file here.

      by Anirudha at August 01, 2013 12:27 PM

      July 30, 2013

      Carnë Draug

      GSoC 2013: GraphicsMagick (update #4)

Octave uses GraphicsMagick (GM) for reading and writing images, which gives us a simple, uniform interface to read many different formats. We actually get support for more image formats than Matlab.

However, I'm starting to think that GM may not be the best option, or even an option at all, if we wish to be Matlab compatible. The reason is that GM does not give us methods to access information about the actual image in the file, only about how the image is being stored. Below I list the problems I have faced during the last few weeks while improving image IO in Octave.

Note that I'm not arguing that GM is bad. We can use it to read pretty much any image; the problem is that what we read may differ from what was in the file. Octave needs the original image values, not one of the possible representations of that image. GM does a great job of being smart, but we may need something more stupid.

      Quantum depth

It is well known among those who use Octave for reading images that we depend on the quantum depth used to build GM. GM can be built with quantum depth 8, 16 or 32, which defines the type it uses to store image values (uint8, uint16 and uint32 respectively). Independently of the original image values, GM will scale them to the range of that type. If the file has an 8-bit depth and GM was built with quantum 16, we need to fix the range (this seems to be one of the few cases where we can get the actual image information, via the depth () method). As long as images have their values as unsigned integers there is no problem, but images are not limited to those types. Images with floating-point values become a problem (see bug #39249), since we lose all information about the range. Also, it appears that at least the JPEG 2000 format may use int16 instead of uint16, and I have no idea what to do about that.
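As a sketch of what fixing the range involves (the variable names are mine, not pkg internals; the factor 257 assumes the usual 0..255 to 0..65535 scaling, since 255 * 257 == 65535):

```octave
raw8  = uint8 ([0, 1, 128, 255]);   # values as stored in an 8-bit file
gm16  = uint16 (raw8) * 257;        # what a quantum-16 GM hands back
back8 = uint8 (gm16 / 257);         # rescale using the known file depth
# back8 now equals raw8 again
```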

      Image class

GM defines two image class types: direct and pseudo class. Direct class is for a "normal" image, "composed of pixels which represent literal color values". Pseudo class is for indexed images, "composed of pixels which specify an index in a color palette", or colormap in Octave vernacular. The problem is that the class is not that of the original image, but whatever GM guesses will increase performance. This leads to some JPEG images (a format with no support for indexed images at all) being reported as PseudoClass, probably because they have few unique colors. And we can't decide based on the image format, since some formats (such as TIFF and PNG) support both classes. If an image in such a format reports being pseudo class, we have no method to find out whether GM chose that for performance or whether it is the original file class.

      Image type

Similarly to the image class, GM also has image types. There are many more image types, which define an image of any class as bilevel, grayscale, RGB or CMYK, with or without opacity (called matte in GM). Just as with the image class, GM guesses which type is better, so a file with an RGB image that only uses one of the color channels will be reported as grayscale, or even bilevel if it only has two values on that channel (see bug #32986 and bug #36820).

For now, there's not much we can do. Adding methods to obtain information about the actual image in the file through GM is not trivial and outside the scope of my project. I have already implemented the options I proposed: the ones that allow reading and writing multi-dimensional images.

      Fixing the problems of indexed images and alpha channels was not part of the proposal, but it made sense to do it as I was going through the code. We are now Matlab compatible, but in some cases there is nothing we can do; it’s just the way GM works.

      If we wish to be 100% Matlab compatible when reading images, some other strategy is needed. Whether that means improving GM or replacing it with another library, I don’t know. What we need is a C++ library that reads and writes images in any format and gives us access to the original image values.

      by carandraug at July 30, 2013 03:11 PM

      Andrej Lojdl


      Octave uses the freetype library to render text for OpenGL. Through the ft_render class, a string is rendered depending on font size, color and other parameters.
      Every text object has an interpreter property, which is by default set to 'tex', although Octave has only a limited set of TeX functionality implemented. All this goes through the ft_render class, and as a result we have the text information in a uint8NDArray, which is then used by OpenGL to render the rasterized image.

      When upgrading an existing system, the programmer has to preserve the existing functionality while adding the new one. To add LaTeX markup, there has to be a class for it, and normally some dispatch mechanism on top which can pick what type of renderer will be used. The solution to this problem came with an abstract class and a dispatch mechanism inside a wrapper. So here is the new class organisation.

      text_render is the top wrapper class; inside it there is a dispatch mechanism pointing to one of three classes for text rendering. It also handles memory, using the base_text_render class to create and destroy objects through a rep pointer.

      base_text_render is an abstract class used as the parent of the text rendering classes. It declares the base virtual methods, labeled as base methods in the image above.

      ft_render is the existing class, using freetype to render text. It inherits the base methods and has additional methods that compute the bounding box, rotate the image, change the rendering mode and get the extent.

      dummy_render is an empty class. It is part of this design because LaTeX rendering requires additional external programs, and the default renderer is useless without the freetype library installed. So, to get around a null rep pointer, this class is picked in that case; it will just have some methods to print an error and help users install freetype and/or LaTeX.

      latex_render wraps the functions already developed in C. They have been modified to use C++ libraries and to work as methods of this class.
       - adapter method - takes the input string and creates a TeX file for further use.
       - render method - uses the LaTeX system and GhostScript as described in an earlier post. The final result is a bitmap image, but the OpenGL renderer needs a uint8NDArray, so this part is not finished.
       - get_bbox method - opens the EPS file and reads the bounding box data, which is then transformed to true values.
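The dispatch described above can be sketched roughly as follows. The class names follow the post, but the selection logic and the method names are illustrative assumptions, not the actual Octave implementation:

```cpp
#include <memory>
#include <string>

// Abstract parent, as base_text_render in the post.
struct base_text_render
{
  virtual ~base_text_render (void) { }
  virtual std::string name (void) const = 0;
};

struct ft_render : base_text_render        // freetype-based renderer
{ std::string name (void) const { return "freetype"; } };

struct latex_render : base_text_render     // LaTeX + GhostScript renderer
{ std::string name (void) const { return "latex"; } };

struct dummy_render : base_text_render     // fallback when freetype is absent
{ std::string name (void) const { return "dummy"; } };

// Wrapper: picks the implementation once and hides it behind a rep pointer.
class text_render
{
public:
  text_render (const std::string& interpreter, bool have_freetype)
  {
    if (! have_freetype)
      rep.reset (new dummy_render ());
    else if (interpreter == "latex")
      rep.reset (new latex_render ());
    else
      rep.reset (new ft_render ());
  }

  std::string renderer (void) const { return rep->name (); }

private:
  std::unique_ptr<base_text_render> rep;
};
```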

      NOTE: The Octave community uses an interesting concept for memory management. There is an abstract class and then a wrapper on top. When someone creates an object, it is not handled directly in further code; rep pointers are used instead. Here is one example from liboctave/util/


      class octave_mutex;

      class octave_base_mutex
      {
      public:
        friend class octave_mutex;

        octave_base_mutex (void) : count (1) { }

        virtual ~octave_base_mutex (void) { }

        virtual void lock (void);

        virtual void unlock (void);

        virtual bool try_lock (void);

      private:
        octave_refcount<int> count;
      };

      class octave_mutex
      {
      public:
        octave_mutex (void);

        octave_mutex (const octave_mutex& m)
          : rep (m.rep)
        {
          rep->count++;
        }

        ~octave_mutex (void)
        {
          if (--rep->count == 0)
            delete rep;
        }

        octave_mutex& operator = (const octave_mutex& m)
        {
          if (rep != m.rep)
            {
              if (--rep->count == 0)
                delete rep;

              rep = m.rep;
              rep->count++;
            }

          return *this;
        }

        void lock (void)
        {
          rep->lock ();
        }

        void unlock (void)
        {
          rep->unlock ();
        }

        bool try_lock (void)
        {
          return rep->try_lock ();
        }

      protected:
        octave_base_mutex *rep;
      };



      by AndrejLojdl at July 30, 2013 11:03 AM

      Mid-term

      And here it is: the mid-term evaluation period is already open, and it seems like only yesterday that we began working on our projects. Since the last post, which explained how the text_render class looks, I have been implementing it in the existing code.

      Here is the final result, a proof-of-concept LaTeX interpreter. As you can see, it needs to be upgraded.

      Here's the part of the Octave code that produces this plot.

      title('Plot with LaTeX','fontsize',28);

      For now I will point out only the main things that should be in the next upgrade:
      - it should have a transparent background
      - the quality of the image should be higher
      - the image should be smaller

         Transparent background
      The problem with the current revision is that it uses the PPM format for transferring data to the pixels member of a render class. It's handy, because Octave can directly open this file: read the image width and height, and then just read the data stored in every pixel (red, green and blue components). But it has one big defect: the alpha channel is missing. So there has to be another format, or some kind of trick to get around this missing information.

         Image quality
      The current image is rendered from the EPS file at 600 dpi. The idea is to render a larger image and then scale it down to fit the original size in the Octave plot. With this trick the rendered LaTeX image will be smoother, and better smoothing gives better image quality. (Note: the rendered image in the plot presented here is rendered at 300 dpi, to produce a smaller image.)

         Image size
      After placing the image on the plot, it is obvious that it's too big. It should be smaller, to fit with the other text on the plot. As mentioned before, we want an image smaller than the rendered one, but with better smoothing, so it will be scaled down to a smaller size to get this result.

      by AndrejLojdl at July 30, 2013 10:47 AM

      Marco Vassallo

      update # 12

      Current status of the project and future developments

      In this post I would like to show the work which has been done since last week, to recap all the important implementation details and to give a timeline for the rest of the project.

      Generation of code on the fly

      As has been shown in the two previous posts, the problems which I presented last week have been partially solved.
      In particular, for the on-the-fly creation of the code from the header file, Juan Pablo provided me with a Python script which does a great job. I spent some time adapting it to my problem, but when I finally got working code, we realized that it was probably enough to use the functions available inside Octave, because the problem was rather simple. The Python code is nevertheless available here, while the final solution adopted can be found there.
      For the problem related to fem_init_env (), we are still working on it.

      Relevant implementation details

      The relevant implementation details which the user should know are:
      • all the objects are managed using boost::shared_ptr <>. This means that the same resource can be shared by several objects and useless copies are avoided. For example, if we have two different functional spaces in the same problem, as with Navier-Stokes for the velocity and the pressure, the mesh is shared between them and neither has its own copy.
      • The BCs are imposed directly on the matrix, setting to zero all the off-diagonal elements in the corresponding row. This means that we could lose the symmetry of the matrix, if any. The symmetry is lost because we are using the apply() method of the dolfin::DirichletBC class. We have thus planned to insert later a function which, instead of the apply() method, calls the assemble_system() function, which preserves the symmetry of the system but builds the lhs and the rhs together.
      • The coefficients of the variational problem can be specified using either a fem-coefficient or a fem-function. They are different objects which behave in different ways: a fem-coefficient object overloads the eval() method of the dolfin::Expression class and is evaluated at run time using the Octave function feval (). A fem-function instead doesn't need to be evaluated, because it is assembled by copying element-by-element the values contained in the input vector.
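The sharing described in the first point can be sketched like this, with boost::shared_ptr swapped for std::shared_ptr and hypothetical minimal types, purely for illustration:

```cpp
#include <memory>

struct mesh { int num_cells; };

// A functional space keeps a shared_ptr to the mesh instead of a copy.
struct function_space
{
  std::shared_ptr<mesh> m;
  explicit function_space (std::shared_ptr<mesh> mm) : m (mm) { }
};
```

Two spaces built from the same pointer, such as velocity and pressure spaces for Navier-Stokes, then refer to the same mesh object; no copy is ever made.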

      Main issues and timeline 

      Below I list the issues that we still have to solve and the features that we still have to add to the program:
      1. The main issue right now seems to be the creation of the sparse matrix. The problem is that we don't have any easy way to estimate the number of non-zero entries in our matrix.
        What I'm doing right now is available here: at the beginning, I allocate a percentage of the maximum number of possible entries (the alpha coefficient in the code), and if this size is reached while filling in the matrix, I use the information available to estimate again what the final size of the sparse matrix could be. The problem is that the initial percentage is chosen without any criterion, and at some points we could have two matrices in memory at the same time.
        The obvious but not very clever solution would be to compute the nz entries by making a first cycle over all the elements of the matrix which counts the number of non-zero elements. Once the nz elements are known, we can allocate the matrix, and with a second cycle we copy the elements different from zero.
        Another possibility would be to build the vectors for the row and column indexes and the vector for the values, and then to create the matrix using them.
        Maybe we could implement both solutions and use one or the other according to the size of the matrix, but the easiest thing would be to get a compressed matrix representation directly from DOLFIN. I asked for it but have not got any answer yet; I will also try the mailing list.
      2. A better naming convention for the functions and a user interface closer to the one available with DOLFIN. I would also like to add some methods which provide useful information about our elements.
      3. Add more examples, and in particular combine the features offered by FEniCS with the great tools available inside Octave. I would like to do some of the following:
        • Study of the generalized eigenvectors for the Stokes problem using the eig () command
        • Show how existing functions in the fpl pkg can be used to export or visualize computation results
        • Show how ODE solvers in Octave can be used in conjunction with dolfin to solve evolutionary problems
        • Study a more complex problem which involves different discrete spaces, like the Darcy-Stokes equation
      4. Add the possibility to solve problems which involve a vector field instead of a scalar one.
      5. Add the possibility to solve problems which involve functionals, for example if we want to estimate the norm of our solution.
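The re-estimation step from point 1 can be sketched as follows (illustrative names; the actual code works with an alpha coefficient over the maximum number of possible entries):

```cpp
#include <cstddef>

// When the preallocated buffer fills after processing rows_done of
// rows_total rows, extrapolate the final number of non-zeros by
// assuming the remaining rows are as dense as the ones already seen
// (rounding up, so the new allocation is never an underestimate
// under that assumption).
std::size_t estimate_final_nnz (std::size_t nnz_so_far,
                                std::size_t rows_done,
                                std::size_t rows_total)
{
  return (nnz_so_far * rows_total + rows_done - 1) / rows_done;
}
```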

      by Gedeone at July 30, 2013 04:24 AM