This is what Twitter does when it wants to display a Flickr picture. It makes a query to a service at Flickr, passing it the URL of the page containing the picture. Flickr returns a bundle of info, in JSON or XML. Twitter parses the info, extracts a link to the picture along with its height and width, and generates an HTML <img> element.
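Here's a rough sketch of that flow in Python, just to make it concrete. It's not how Twitter actually does it. The endpoint URL is passed in by the caller, the function names are made up for illustration, and the field names (type, url, width, height) are the ones the oEmbed spec lists for a photo response. It skips the network fetch entirely and works on a JSON string you already have.

```python
import json
from html import escape
from urllib.parse import urlencode


def oembed_request_url(endpoint, page_url, fmt="json"):
    """Build the query URL for an oEmbed endpoint (hypothetical helper).

    `endpoint` is whatever URL the provider publishes for its oEmbed
    service; `page_url` is the page that contains the picture.
    """
    return endpoint + "?" + urlencode({"url": page_url, "format": fmt})


def img_from_oembed(payload):
    """Turn a JSON oEmbed photo response into an <img> element (hypothetical helper)."""
    info = json.loads(payload)
    if info.get("type") != "photo":
        raise ValueError("expected an oEmbed response of type 'photo'")
    # Escape the URL and coerce the dimensions to integers so the
    # generated markup can't be broken by odd values in the response.
    return '<img src="{}" width="{}" height="{}">'.format(
        escape(str(info["url"]), quote=True),
        int(info["width"]),
        int(info["height"]),
    )
```

You'd fetch the URL that oembed_request_url returns with whatever HTTP client you like, then hand the response body to img_from_oembed.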
Discovery works more or less the same way it does for RSS. Read the HTML source of the page containing the picture, and look for a link element in the head with type equal to application/json+oembed (or text/xml+oembed for the XML flavor), rel equal to alternate, and an href attribute pointing to the JSON or XML struct. It's all there in the oEmbed spec, no need to reproduce it here.
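The discovery step is simple enough to sketch with nothing but the Python standard library. This is an illustration, not a reference implementation. The class and function names are invented here, and it assumes the two type values from the oEmbed spec, application/json+oembed and text/xml+oembed. It scans every link element in the document rather than only those in the head, which is a simplification.

```python
from html.parser import HTMLParser

# The two MIME types the oEmbed spec uses for discovery links.
OEMBED_TYPES = {"application/json+oembed", "text/xml+oembed"}


class OEmbedLinkFinder(HTMLParser):
    """Collects (type, href) pairs from oEmbed discovery links (hypothetical class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated tokens; attributes
        # without a value come through as None, hence the `or ""`.
        rels = (attributes.get("rel") or "").lower().split()
        link_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and link_type in OEMBED_TYPES and href:
            self.links.append((link_type, href))


def discover_oembed(html_source):
    """Return every oEmbed discovery link found in the HTML source."""
    finder = OEmbedLinkFinder()
    finder.feed(html_source)
    return finder.links
```

An empty list back from discover_oembed means the page has no discovery links, so the caller has to already know where the provider's endpoint is.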
However, it doesn't appear that Flickr is supporting this discovery protocol in their HTML source. There are no link elements with that structure in the head section of this page.
There's a lot of stuff about discovery in the beginning of the document. It's quite dense, and I don't see why it's there at all. Why not just use the <link> form of discovery? It's rational, there's ample prior art (it's how RSS feeds are discovered), and it's a lot simpler.
Further, why not just include the info you would get back from the query in the HTML source? As it stands, the query is one more web service to keep running. This really is static content that only has to be regenerated when the content changes, and that's usually pretty infrequent.