Warp wrote:
> Invisible <voi### [at] dev null> wrote:
>> It's news to me that it's possible to access anything web-based with
>> something other than a web browser.
>
> Why would the http protocol be limited to web browsers?
It isn't. I personally have written a small Haskell program that
connects to the PassMark website, downloads the current benchmark table, and
gives me a CSV file containing all the current results.
Of course, writing an HTTP client is very nontrivial. If anything
remotely unusual were to happen, my program would hopelessly fall over.
More to the point, there's masses of tricky parsing to wade through all
the presentational HTML to extract the actual raw data I'm after.
(The original plan was to then connect to ebuyer.com and dredge out the
processor prices and availability - but this turned out to be far too hard.)
The fun part is, if PassMark ever change their presentational HTML (or
their URLs), my program will spectacularly break.
I wouldn't really call this "interfacing my program to a website". I'd
call this "a fragile hack which just happens to sort-of work right now".
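To make the fragility concrete, here is a minimal sketch of that kind of scraper. It is in Python rather than the Haskell mentioned above, it skips the HTTP download entirely, and `SAMPLE_HTML`, `TableScraper` and `html_table_to_csv` are hypothetical names invented for illustration. The real PassMark markup is not shown anywhere in this thread, so the sample table is an assumption.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup standing in for a benchmark page (an assumption,
# not the real site's HTML).
SAMPLE_HTML = """
<table id="results">
  <tr><th>CPU</th><th>Score</th></tr>
  <tr><td><b>Example CPU A</b></td><td>1234</td></tr>
  <tr><td><b>Example CPU B</b></td><td>987</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside presentational tags such as <b> still lands here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    scraper = TableScraper()
    scraper.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(scraper.rows)
    return out.getvalue()


print(html_table_to_csv(SAMPLE_HTML))
```

If the page stopped using `<tr>`/`<td>` for layout, this would silently produce an empty file rather than an error, which is the "spectacularly break" case.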
> That's because they are both Micro$oft products.
Not really, it's because there is an agreed standard for how the information
is transferred over http (of course if both sides are made by the same
company this is usually automatically the case).
How about RSS feeds? Loads of programs have been written that access those
over the web.
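As a rough illustration of why an agreed format helps, here is a minimal Python sketch that reads a feed with nothing but the standard library. `SAMPLE_RSS` is a made-up RSS 2.0 document and `feed_items` is a hypothetical helper; a real program would fetch the XML over HTTP first.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 feed used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def feed_items(rss_text):
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in feed_items(SAMPLE_RSS):
    print(title, link)
```

Because the element names are fixed by the format, the same few lines should work for any feed that follows it, whoever wrote the server.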
scott wrote:
>> That's because they are both Micro$oft products.
>
> Not really, it's because there is an agreed standard for how the
> information is transferred over http (of course if both sides are made
> by the same company this is usually automatically the case).
Well, let's face it, any system is merely an agreed way to send data
between multiple computers. I'm just saying, I'm not aware of any way of
accessing an arbitrary website programmatically without going through a
world of pain.
> How about RSS feeds, loads of programs written that access those over
> the web?
I don't know what RSS is.
> I don't know what RSS is.
http://tinyurl.com/ct6woq
scott wrote:
>> I don't know what RSS is.
>
> http://tinyurl.com/ct6woq
Oh, the Wikipedia article on RSS. I would *never* have thought of that. :-P
Seriously, did you think I hadn't read that already?
Nicolas Alvarez <nic### [at] gmail com> wrote:
> Darren New wrote:
> > http://asserttrue.blogspot.com/2009/04/api-first-design.html
> >
> > Huh? Doesn't everyone document the APIs before they start writing code
> > that exports APIs? Seriously, don't you programmers do that?
> >
> > I can't imagine how you can even know when you're done a piece of the
> > library if you didn't do the documentation first. I thought that was,
> > like, the only way to do it.
>
> I spent weeks deciding on the API, some more weeks deciding on internal
> design, and I've yet to write any *real code* for this one library... :)
From my experience, a *typical* real-life working mode would be to
(1) design and (optionally) document the API
(2) design and (optionally) document the internal design, modifying the API as
problems with the internal design arise
(3) write the real code, modifying both the internal design and API as problems
with the real code arise
(4) start working on a new project
(5) (optionally) document the changes made to the API in phases (2) and (3)
(6) (optionally) document the changes made to the internal design in phase (3)
clipka wrote:
> From my experience, a *typical* real-life working mode would be to
<chortle> I think you hit the nail on the head.
--
Darren New, San Diego CA, USA (PST)
There's no CD like OCD, there's no CD I knoooow!
Invisible wrote:
> More to the point, there's masses of tricky parsing to wade through all
> the presentational HTML to extract the actual raw data I'm after.
That's because you're getting the results in HTML instead of "via an API".
That's what people are talking about when they talk about web APIs:
presenting data without making you parse it.
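For contrast with scraping, here is a minimal Python sketch of consuming a web API response. JSON is used as one common choice, not something the thread specifies. The `SAMPLE_RESPONSE` body, its field names and the `scores` helper are all hypothetical; no real site's API is being described.

```python
import json

# Hypothetical response body from an imagined benchmark API.
SAMPLE_RESPONSE = (
    '{"results": ['
    '{"cpu": "Example CPU A", "score": 1234}, '
    '{"cpu": "Example CPU B", "score": 987}]}'
)


def scores(body):
    """Map CPU name to score from a JSON response body."""
    return {r["cpu"]: r["score"] for r in json.loads(body)["results"]}


print(scores(SAMPLE_RESPONSE))
```

There is no presentational markup to wade through: the structure of the data is the structure of the response.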
"REST" is a pattern for doing this in a way that *also* lets you
reverse-engineer the protocol by looking at it in a web browser, and which is
theoretically kind to intermediate proxies.
"SOAP" is a pattern for doing this in a way that lets you publish the
specification of the interface in a form from which a tool can generate code
to decode the data into whatever native data structures your language provides.
What you're doing is called "screen scraping", and yes, it breaks when the
format of the web page changes, which happens often when there are so many
people doing screen scraping that it starts to impact the actual customers
of the web site.
--
Darren New, San Diego CA, USA (PST)
There's no CD like OCD, there's no CD I knoooow!
Invisible wrote:
> accessing an arbitrary website programmatically without going through a
> world of pain.
No, you can't access "arbitrary" websites that present HTML. Well, you can,
depending on what you want to do. Google does it all the time. But it's hard
to get specific information out in an automated way. That's why people
invented HTTP-based APIs like SOAP.
--
Darren New, San Diego CA, USA (PST)
There's no CD like OCD, there's no CD I knoooow!
>> accessing an arbitrary website programmatically without going through a
>> world of pain.
>
> No, you can't access "arbitrary" websites that present HTML. Well, you
> can, depending on what you want to do. Google does it all the time.
Uhuh. And how many PhDs work for Google?
> But it's hard to get specific information out in an automated way. That's
> why people invented HTTP-based APIs like SOAP.
Why would you base a protocol on HTTP?