The following is a guest post by David Riecks, leader of the Photo Metadata Project.
Storing information about your images inside the image itself provides a number of useful benefits. Digital photographers may refer to this as embedded photo metadata or just metadata for short. For professional photographers it’s an easy way to let potential publishers know they took the photo and how to contact them.
Storing this information inside the image can’t prevent others from misusing the image, but it can help others know more about it: who is pictured in a photo, what they are doing (and maybe why), as well as where and when it was taken. However, all of those benefits are lost if this metadata doesn’t “stick” to the image as it travels from one computer to another and onto the web.
The International Press Telecommunications Council recently conducted a survey, as part of the Embedded Metadata Manifesto, to raise awareness of the problems that can occur when using many of the social media networks and photo sharing sites. The survey shows that a number of the more popular services strip this embedded information from images when the images are uploaded to the services or processed on their servers.
[See also the video of IPTC Managing Director, Michael Steidl, “Do embedded rights metadata of photos survive social media systems?”]
While some comments on news blogs claim that professional photographers are raising this issue because they are only concerned about maintaining copyright or attribution information, it’s not that simple. That is because most methods of storing this information use one of three different storage “containers” within a digital image to hold the information. If copyright or contact information is removed, that almost always means that captions, locations and even the date the photo was taken will be lost as well.
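To make the “containers” idea concrete: in JPEG files the three containers are usually taken to be Exif, IPTC-IIM (kept inside a Photoshop resource block) and XMP, each stored in its own application segment ahead of the image data. The Python sketch below is only an illustration and not a tool from the survey. The function name `find_metadata_containers` is made up for this example, and the code assumes a well-formed JPEG. It walks the segment headers and reports which containers are present, so you can see that stripping those segments removes the copyright notice, caption and date together.

```python
import struct

# Signatures that commonly identify each metadata container in a JPEG segment.
EXIF_SIG = b"Exif\x00\x00"                     # found in APP1 (0xFFE1)
XMP_SIG = b"http://ns.adobe.com/xap/1.0/\x00"  # found in APP1 (0xFFE1)
IIM_SIG = b"Photoshop 3.0\x00"                 # found in APP13 (0xFFED), wraps IPTC-IIM


def find_metadata_containers(jpeg_bytes: bytes) -> set:
    """Return the names of the metadata containers present in a JPEG.

    Simplified sketch: it stops at the first scan (SOS) or end (EOI) marker
    and does not parse the contents of the containers themselves.
    """
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")

    found = set()
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # lost sync with the segment structure; give up
        marker = jpeg_bytes[pos + 1]
        if marker == 0xFF:
            pos += 1  # padding byte before a marker
            continue
        if marker in (0xDA, 0xD9):
            break  # start of image data or end of file
        # The two-byte length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(EXIF_SIG):
            found.add("Exif")
        elif marker == 0xE1 and payload.startswith(XMP_SIG):
            found.add("XMP")
        elif marker == 0xED and payload.startswith(IIM_SIG):
            found.add("IPTC-IIM")
        pos += 2 + length
    return found
```

To try it on a file of your own, read the file in binary mode and pass the bytes to the function, for example `find_metadata_containers(open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder name.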
A recent example concerns a controversial photograph that people thought was taken by master photographer Henri Cartier-Bresson rather than Andrej Vasilenko, despite the modern clothing and backpack.
However, it’s not just about proper attribution or maintaining your copyright notice; it’s also about preservation. For example, while many people may know some well-known online images (like Dorothea Lange’s “Migrant Mother”), if they downloaded it, would they know where or when it was taken if we lost all the metadata, including the caption? Most documentary images will have little cultural value if we don’t know at least a few of the basic Who, What, Why, When, Where and How’s of the image in front of us.
One of my neighbors stored all of his digital photos on a big desktop computer with multiple hard drives but he didn’t have his images backed up. After an electrical-power surge, the motherboard and all the hard drives were ruined. The neighbor had uploaded images to Facebook, so he thought that not everything was lost.
But Facebook re-sized his images when they were uploaded and during this process either their metadata was removed or Facebook stored this information separately from the images. All my neighbor had left were smaller versions of the images with the same date stamp (representing the day they were downloaded). This meant it wasn’t even possible for him to sort the photos into any chronological order.
If you were a musician, would you consider uploading your MP3 audio files to a site if you knew that the process removed the name of the song, the album it was from, the name of the band and your copyright notice? If you wouldn’t do that, then why should it be different for photos?
At its heart, the Embedded Metadata Manifesto explains that the information users add to digital files is critically important, and once it is added it should not be removed.
I’ve yet to encounter a social media or photo sharing network that warns me, when I upload my images, that the work I’ve done to add captions will be undone. Some will actually pick up the caption information and put that on the page where the photo is shown, so the assumption would be that the other information you added earlier would still be there.
For more information, visit the survey page and see whether the service(s) you use preserve your photo metadata. Don’t forget that with most networks, you can always link to an image on another network when sharing that file. This way you can rest assured that the image will retain its embedded information and your friends can still see your work.
Thank you for this post. Very interesting and very important!
When I talk to photo groups, or to my customers, they are amazed that this information is often stripped. This is one of my pet peeves.
During a recent presentation to the Wasatch Camera Club in Salt Lake City, I mentioned that this was one of my primary concerns about using so-called hosting services in the cloud to store images. When one of the attendees mentioned that he used one of the photo sharing sites as a form of backup, I pointed out that the photos are resized automatically on that site, so it is far from a backup. At the most, it MAY be a way to figure out what you have lost, but it will in no way replace the lost content.
I saw light bulbs go off in the room. I expect many in my presentation are shopping for alternative/secondary storage now.
David: Glad to see that others are aware of this issue, and helping others to see the light. There are some social media / photo sharing sites that do preserve the embedded metadata even if the image is resized. Even then, what you are left with is a “derivative” file — rather than the “original.” Some networks will hold a copy of the original file you uploaded, though it may not be the one others see when they view the image you share. To top it all off, online systems are “tweaked” periodically, and afterwards they may work completely differently. That’s why it’s important to test regularly to make sure all is as you expect.
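For anyone who wants to run that kind of regular test themselves, here is a rough Python sketch of one way to do it: upload a file, download it again, and compare which metadata containers survived. The names `SIGNATURES`, `containers_present`, `lost_containers` and `check_round_trip` are invented for this example, and the check is deliberately crude. It only looks for each container’s signature bytes anywhere in the file, so it tells you whether a container is still there, not whether every field inside it is intact.

```python
from pathlib import Path

# Byte signatures that normally introduce each container in a JPEG file.
SIGNATURES = {
    "Exif": b"Exif\x00\x00",
    "IPTC-IIM": b"Photoshop 3.0\x00",
    "XMP": b"http://ns.adobe.com/xap/1.0/\x00",
}


def containers_present(data: bytes) -> set:
    """Names of the containers whose signature appears in the file bytes."""
    return {name for name, sig in SIGNATURES.items() if sig in data}


def lost_containers(original: bytes, downloaded: bytes) -> list:
    """Containers that were in the original but are missing after the round trip."""
    return sorted(containers_present(original) - containers_present(downloaded))


def check_round_trip(original_path: str, downloaded_path: str) -> list:
    """Compare a local original with the copy downloaded back from a service."""
    return lost_containers(
        Path(original_path).read_bytes(),
        Path(downloaded_path).read_bytes(),
    )
```

An empty list from `check_round_trip` means all the containers found in the original are still present in the downloaded copy. Since services change without notice, it is worth repeating the check from time to time.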
Why is it still a revelation to people that they need to be responsible for their own backups? Backups in the cloud are as good as the promises made in the cloud. Rock solid, until they aren’t. Uploading with periodic testing is one way to back up, but when the testing fails, panic sets in until a new backup is in place.
Redundant backups on hard media – hard drive, SSD, DVD – are still the safest way to go. One or two copies onsite, at least one copy offsite.
Time consuming and inconvenient – perhaps.
Saves the bacon when the computer/cloud melts down – absolutely.
Thank you for such a thoughtful post on the magically disappearing embedded metadata. Perhaps there could be some sort of rating system, such as the one provided by Creative Commons for copyrights, that could be used to quickly identify (and alert users to) what happens with their images’ metadata? Very intriguing and troubling problem considering all that is involved in image and metadata making.
Thanks for sharing such an in-depth article on this particular subject. I am an active duty AF veteran with a degree in History and currently pursuing my Masters in History. It is ironic that we are discussing a similar topic this week in class about images and pictures in reference to historical research. It is shocking how some of the most famous pictures we have come to know have been manipulated for political purposes or emotional factors to capture the audience’s attention. I will share this article with my classmates, who I know will enjoy reading it. Here is a website you may already be familiar with, but I thought I would share it. http://www.cs.dartmouth.edu/farid/research/digitaltampering/
Thank you for the article. I am digitalising my personal collection of original ancient British and Irish newspapers, which are photographed as part of the process. I’ve had problems with the watermarking being stripped. So I would be interested in any low-cost recommendations to embed information in the image.
What makes this situation sadder is that this is happening just as we’re trying to get more content producers to think in terms of adding metadata to their works. I can see this evolving into a “what’s the point?” sort of situation if word of this spreads and the social networks continue to ignore the situation.
This story’s author (David Riecks), along with IPTC Managing Director, Michael Steidl, and PLUS Coalition CEO, Jeff Sedlik, will be featured on a live Web discussion panel on this very topic on May 08, 2013. Here’s the signup URL if you’re interested: http://picturepark.com/copyright (It’s free)
The Embedded Metadata Manifesto (which I mentioned in the article) recently concluded a summary of the major social media networks. Unfortunately, it appears that the link I provided, which should have taken you to this page, is only going to the main site. The direct link is http://embeddedmetadata.org/social-media-test-results.php (hope that helps).
Thanks for that link. I’m familiar with Dr. Farid’s work tangentially, through a longtime colleague, Kevin Connor (previously with Adobe), as both are now partners in FourandSix (http://www.fourandsix.com/), a company which deals with image authentication.
I’m not sure if I understand how your watermarks are being removed. Are these “visible” watermarks, or invisible ones (like Digimarc)? You can learn more about embedding metadata at http://www.photometadata.org/META-Tutorials and http://www.controlledvocabulary.com/imagedatabases/programs.html, or you can contact me via http://www.photometadata.org/contact with more details.
Also see: http://petapixel.com/2013/03/14/study-looks-into-whether-photo-websites-play-nicely-with-copyright-metadata/
I’m kind of torn here. I fully understand and appreciate the magnitude of the issue–particularly as it relates to the possible loss of the identification, context, and aesthetic quality of the original image (due to cropping/reformatting), but this is something that the creator/author of the image willingly submits to by uploading his or her images to a social media network like Facebook. It’s a compromise, and buyer beware.
The best approach to this problem is, of course, to back up your images/documents on different drives and keep copies in different physical locations. Additionally, for the purpose of advertising or sharing your work with the masses, create your own website (or have one created with assistance from someone who knows what they’re doing).
In this respect, use of social media networks should serve as informal creative/marketing outlets–serving to inform, but by no means replace either a true(r) online presence (like LOC’s PPOC) or your original digital copies, because, as indicated above, they do not retain the original data, and they were not designed for incorporating or maintaining metadata like Dublin Core.
Thanks for the feedback. A couple of points I think need to be made in clarification.
You say that “this is something that the creator/author of the image willingly submits to by uploading his or her images to a social media network like Facebook. It’s a compromise, and buyer beware.” First off, nowhere does Facebook inform you that they are resizing your image, and removing the Exif Date/Time stamp, any GPS information (if it is present), and any descriptive information you have added (like the names of the people in the image). In fact, Facebook now recognizes captions embedded in images and will actually extract them and put them on the page. So if that is happening, I think most people would reasonably expect that, if the Facebook system is aware of that info and using it, it would also be retained in the image that is displayed. Second, can it really be “buyer beware” when you aren’t paying for the service?
The most astute observation I’ve heard on that front runs along the lines of “if you aren’t paying for the service, YOU are the product.” Read into that as much as you will.
As for your other points, I completely concur. The best option is to have your own backups, and to only store images online where you have control. However, the problem is that many people reading this website either don’t have the expertise or finances to do that, and no one has told them how risky it is to use these services in lieu of their own archive. That’s why I took the time to point this out, and hope that by doing so, some will take the additional steps to secure their images before anything bad happens.
Thanks for the insights.
A big thank you to the Library of Congress, David Riecks, and the Embedded Metadata Manifesto for bringing this issue to the public.
A point I want to mention is that stripping metadata from images essentially makes them into orphan works. Metadata acts as a contextual agent, documenting the “who, what, where, when, why, and how”, and gives an image identity and authenticity. Once data about the image is removed, much of its meaning is lost. This can lead to unfortunate consequences. Under an act recently passed in the UK, these orphaned photos could be used by for-profit companies without the consent of the copyright holder(s) (ERR Act). As Riecks mentions, the preservation and findability of these digital assets become problematic.
Also, I wonder how many of these points can be applied to video content posted to social media sites. I know that Vimeo, YouTube and Facebook transcode/re-compress your video. As far as I can tell, any embedded metadata is similarly stripped from those assets.
The International Press Telecommunications Council (IPTC) has updated this survey, and the results show that little has changed.
In three years you might have expected at least a little good news…. sigh.
Has the situation gotten any better for 2021?