POV-Ray : Newsgroups : povray.off-topic : Apparently it didn't catch on : Re: Apparently it didn't catch on Server Time
28 Jul 2024 20:29:36 EDT (-0400)
  Re: Apparently it didn't catch on  
From: Orchid Win7 v1
Date: 24 Feb 2013 07:43:56
Message: <512a0b0c$1@news.povray.org>
On 23/02/2013 04:17 PM, Orchid Win7 v1 wrote:
> http://research.microsoft.com/en-us/um/people/simonpj/Papers/excel/excel.pdf

For those unable to read the paper, a short summary...

The number of people who use spreadsheets dwarfs the number of 
professional computer programmers. And yet, some of these spreadsheets 
are extremely complex, sophisticated constructions. In the main, these 
are built not by trained programmers, but by accountants, salesmen, 
purchasing associates, and anybody else who has a need to do complicated 
numerical calculations.

A complex spreadsheet can be considered to be a "program", the formulas 
in the cells being its "source code". And yet, accountants don't 
generally sit down and learn to write code in C, C++, Java, Perl, 
Python, R, or even something like Mathematica which is heavily 
problem-oriented rather than machine-oriented. Why is that?

Arguably a spreadsheet is easier to comprehend. You type in a formula 
and it makes a number. You can immediately *see* what is happening. In 
traditional programming languages, you write a bunch of text, press go, 
and maybe some kind of result is produced. It's far less visual, even if 
you have a REPL to use. You don't literally *see* all the intermediate 
calculation steps, unless you fire up a purpose-built debugger.

Which a spreadsheet, there is no separate coding and debugging phase. 
You just have a grid of cells, which you can modify at will, at any 
time, for any reason. This makes it easier to gradually construct a 
sophisticated application from humble beginnings.

When you start to look at a spreadsheet as a program, you realise that 
as a programming language, it's pretty damned weak. In particular, it is 
impossible to define new data types, and it has impossible to define 
subroutines. The *only* way to repeat a calculation on more than one 
input is to literally copy and paste it. (And most spreadsheets have at 
least moderately sophisticated mechanisms for doing intelligent copy and 
pasting.)

As a trained computer programmer, this makes me gasp in horror. It is a 
clear violation of the DRY principle. But as an accountant, or a 
quantity surveyor, you're not going to know or care about such things. 
You're just trying to produce this month's figures (or whatever).

There *is* of course a facility in Excel to define functions. However, 
this requires you to learn Visual Basic, a completely different 
programming language, with radically different semantics. It's a bit 
like the way C lets you write macros, but in a totally different 
language (CPP), whereas Lisp lets you write macros in Lisp.

So, the main thrust of the paper is "what if you could write Excel 
functions in Excel?"

The mechanism they chose is to let you define a special "function 
worksheet". Certain cells are the function's input, and a certain cell 
is the function's output. Every time you call that function, notionally 
a new copy of this function worksheet is dynamically created with the 
appropriate cells filled in, and whatever ends up in the output cell is 
copied back to the formula that mentions the function.

That's the gist of the idea. Naturally, if you edit one instance of the 
function worksheet, the change affects *all* instances, automatically. 
And therein lies the payoff.

Trouble is... Well, actually there's a whole *host* of problems. Problem 
#1 is that your new function might be called twenty billion times. 
You're not going to display twenty billion worksheet tabs! The solution 
they came up with as that only one "instance" of each function sheet can 
be visible at once.

Problem #2 is "how do we support recursion?" They chose the solution 
"you don't". Which seems reasonable; how many people actually want to 
use that feature? Do we really want to complicate the interface further 
just to make those guys happy?

Problem #3 is how to support functions that take *ranges* of cells as 
input. Their solution was to make a cell matrix a first-class value, so 
that a single cell can contain an entire matrix. And since a matrix is 
made up of cells, matrices can be nested arbitrarily deep.

In short, the solution is to UTTERLY BREAK THE SPREADSHEET MODEL! This 
is the least convincing part of the paper for me. The cost of the user 
learning this completely new mechanism seems too high compared to the 
modest payoff of handling ranges of different sizes. I think a better 
approach could be found.

So there we are. There's lots of screenshots, which gives the 
*impression* that they had a working product. In 2003. But in the 
conclusion they explain that these are simply mockups and simulations, 
not a real, usable product. They make it clear that they are in 
discussions with Microsoft about making this feature available in the 
real commercial Excel product (Simon PJ works at Microsoft Research 
Cambridge, after all). But, clearly, this never came to anything.

I think this could very easily be one of those features that nobody 
knows about and nobody ever uses, unless you made it *extremely* simple 
to comprehend and to use. For example, the vast majority of people who 
use Word apparently have no idea what the hell a "mail merge" is, or 
what Word has the capability to do such a thing. (Then again, most 
people probably don't *need* it either...)

I think it would be very cool to define functions in Excel, but I doubt 
Excel will be the one to pioneer this. I'm also not sure how much 
anybody would use it...


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.