Friday, May 11, 2018

Report writing in forensic multimedia analysis

You've analyzed evidence. You've made a few notes along the way. You've turned those notes over as part of the process. Your agency doesn't have a specific requirement about what should be in your notes or your report, or how detailed they should be. In all the cases that you've worked, you've never been asked for specifics / details.

Now, your case has gone to trial. An attorney is seeking to qualify you to provide expert (opinion) testimony. They introduce you, your qualifications, and what you've been asked to do. The judge may or may not declare you to be an expert so that your opinion can be heard.

As a brief aside, your title or job description can vary widely. I've been an analyst, specialist, director, etc. FRE 702, and the similar rule in your state's evidence code, govern your testimonial experience. Here's the bottom line: according to the evidence code, you're not an "expert" unless the judge says so, and then only for the duration of your testimony in that case. After you're dismissed, you go back to being an analyst, specialist, etc. You may have specific expertise, and that's great. But the assignment of the title of "expert" as it relates to this work is generally done by the judge in a specific case, related to the type of testimony that will be offered.

A technician generally offers testimony about a procedure and the results of the procedure. No opinion is given. "I pushed the button and the DVR produced these files."

An expert generally offers opinion-based testimony about the results of an experiment or test. "I've conducted a measurement experiment and, in my opinion, the unknown subject in the video at the aforementioned date/time is 6’2” tall, with an error of ..."

Everything's OK ... until it's not. You've been qualified as an expert. Is your report ready for trial? What should be in a report anyway?

First off, there are two types of guidance for answering this question. The first type, people's experiences, might help. But, then again, it might not. Just because someone got away with it doesn't make it a standard practice. Just because you've been through a few trials doesn't make your way "court qualified." These are marketing gimmicks, not standard practices. The second type, a Standard Practice, comes from a standards body like ASTM. As opposed to the SWGs, which produce guidelines (it would be nice if you ...), standards-producing bodies like ASTM produce standards (you must/shall). For the discipline of Forensic Multimedia Analysis, there are quite a few standards that govern our work. Here are a few of the more important ones:

  • E860-07. Standard Practice for Examining And Preparing Items That Are Or May Become Involved In Criminal or Civil Litigation
  • E1188-11. Standard Practice for Collection and Preservation of Information and Physical Items by a Technical Investigator
  • E1459-13. Standard Guide for Physical Evidence Labeling and Related Documentation
  • E1492-11. Standard Practice for Receiving, Documenting, Storing, and Retrieving Evidence in a Forensic Science Laboratory
  • E2825-12(17). Standard Guide for Forensic Digital Image Processing

Did your retrieval follow E1188-11? Did your preparation of the evidence items follow E860-07? Did you assign a unique identifier to each evidence item and label it according to E1459-13? Does your workplace handle evidence according to E1492-11? Did your work on the evidence items follow E2825-12?

If you're not even aware of these standards, how will you answer the questions under direct / cross examination?

Taking a slight step back, and adding more complexity, you're engaged in a forensic science discipline. You're doing science. Science has rules and requirements as well. Scientific reports, in general, follow a common structure. Go search for scientific reports and papers in Google Scholar or ProQuest. The contents and structure of the reports you'll find are governed by the authors' accredited institutions. I've spent the last 8 years working in the world of experimental science, conducting experiments, testing data, forming conclusions, and writing reports. The structure for my work was found in the school's guidance documentation and enforced by the school's administrative staff.

How do we know we're doing science? Remember the NAS Report? The result of the NAS Report was the creation of the Organization of Scientific Area Committees (OSAC) for Forensic Science about 5 years ago. The OSAC has been hard at work refining guidelines and producing standards. Our discipline falls within the Video/Imaging Technology and Analysis (VITAL) Subcommittee. In the interest of disclosure, I've been involved with the OSAC since its founding and currently serve as the Video Task Group Chair within VITAL. But this isn't an official statement by or for them. Of course, it's me (as me) trying to be helpful, as usual. :)

Last year, an OSAC group issued a new definition of forensic science that can be used for all forensic science disciplines. Here it is:

Forensic science is the systematic and coherent study of traces to address questions of authentication, identification, classification, reconstruction, and evaluation for a legal context. Source: A Framework to Harmonize Forensic Science Practices and Digital/Multimedia Evidence. OSAC Task Group on Digital/Multimedia Science. 2017

What is a trace? A trace is any modification, subsequently observable, resulting from an event. You walk within the view of a CCTV system, you leave a trace of your presence within that system.

Thus it is that we're engaged in science. Should we not structure our reports in the same way, using the available guidance as to how they should look? Of course. But what would that look like?

Let's assume that your report has a masthead / letterhead with your/your agency's name and contact information. Here's the structure of a report that (properly completed) will conform to the ASTM standards and the world of experimental science.

Administrative Information
     Examiner Information
     Requestor Information
     Unique Evidence Control Number(s)
     Chain of Custody Information
Summary of Request
     Service Requested (e.g. photogrammetry, authentication, change of format, etc.)
Methodology
     Equipment List
     Experimental Design / Proposed Workflow
Limitations / Delimitations
     Delimitations of the Experiment
     Limitations in the Data
     Personnel Delimitations / Limitations
Processing
An Amped FIVE Processing Report can be inserted here, as it conforms to ASTM E2825-12(17).
Results / Summary
     Problems / Errors Encountered
     Validation
     Conclusions
     List of Output File(s) / Derivatives / Demonstratives
Approval(s)
     Examiner
     Reviewer
     Administrative Approval

It would generally conclude with a declaration and a signature. Something like this, perhaps:

I, __________, declare under penalty of perjury as provided in 28 U.S.C. §1746 that the foregoing is true and correct, that it is made based upon my own personal knowledge, and that I could testify to these facts if called as a witness.

Now, let's talk about the sections.

The Administrative section.

  • You're the examiner. If someone helped you in your work, they should be listed too. Co-workers, subcontractors, etc.
  • The requestor is the case agent, investigator, or the client. The person who asked you to do the work.
  • Every item of evidence must have a unique identifier.
  • Every item received must be controlled and its chain of custody tracked. If others accessed the item, their names would be in the evidence control report / list. DEMS and cloud storage solutions like Evidence.com can easily do this and produce a report.
Summary of Request
  • What were you asked to do, in plain terms? For example, "Given evidence item #XXX, for date/time/camera, I was asked to determine the vehicle's make/model/year" - comparative analysis / content analysis. Or, "Given evidence item #XXX, for date/time/camera, I was asked to estimate the unknown subject's height" - photogrammetry. Or, "Given image evidence item #XXY-X, retrieved from evidence item #XXY (see attached report), I was asked to determine if the image's contextual information had been altered" - authentication.
  • Provide an abstract of the test and the results - a brief overview of what was done and what the results were (with references to appropriate page numbers). 

Methodology

  • What tools did you use - hardware / software? You may want to include a statement as to each and its purpose / fitness for that purpose. As an example, I use Amped FIVE. Amped FIVE is fit for the purpose of conducting image science experiments as it is operationalized from peer-reviewed / published image science. Its processing reports include the source documentation.
  • Your proposed workflow. What will guide your work? Can you document it easily? Does your processing report follow this methodology? Hint, it should. Here's my workflow for Photogrammetry, Content Analysis, and Comparative Analysis. You can find it originally in my book, Forensic Photoshop. It's what I use when I work as an analyst. It's what I teach.


Limitations / Delimitations

  • Delimitations are the bounds within which your work will be conducted. I will test the image. I won't test the device that created the image.
  • With DME, there are a ton of limitations in the data. If the tested question is "what is the license plate," and a macroblock analysis determines that there is no original data in the area of the license plate, then that is a limitation. If the tested question is "what is the speed of the vehicle," and you don't have access to the DVR, then that is a huge limitation. Limitations must be stated.
  • Personnel issues should also be listed. Did someone else start the work that you completed? Was another person employed on the case for a specific reason? Did something limit their involvement? If the question involves the need to measure camera height at a scene, and you can't climb a ladder so you mitigated that in some way, list it. 
A side note here ... did you reach out to someone for help? Someone like the DVR's technician or your analysis tool manufacturer's support staff? Did they assist you? Make sure that you list their involvement. Did you send out a copy of the evidence to someone? If yes, is it within your agency's policy to release a copy of the evidence in the way that you did for this case? As an example, you send a still image of a vehicle to the FVA list asking for help. You receive a ton of advice that helps you form your conclusion, or helps the investigation. Did you note in your report that you interacted with the list and who helped? Did you provide a copy of the correspondence in the report package? Did you provide all of the responses, or just the ones that support your conclusion? The ones that don't support your eventual conclusion should be included, with an explanation as to why you disagree. They're potentially exculpatory, and they should be addressed.

Remember, on cross examination, attorneys rarely ask questions of people blindly. They likely already know the answer and are walking you down a very specific path to a very specific desired conclusion. Whilst an attorney might not subpoena Verint's tech support staff / communications, as an example, they may have access to the FVA list and may be aware of your communications about the case there. You may not have listed that you received help from that source, but the opposing counsel might bring it up. You won't know who's watching which source. They may ask if you've received help on the case. How would you answer if you didn't list the help and disclose the communications - all of the communications? If your agency's policy prohibits the release of case-related info, and you shared case-related info on the FVA list, your answer to the question now involves specific jeopardy for your career. I've been assigned to Internal Affairs, I've been an employee rep, and I know how the system works when one has been accused of misconduct. How do you avoid the jeopardy? Follow your agency's policies and keep good records of your case activity.

Processing

  • These are the steps performed and the settings used. This section should read like a recipe so that some other person with similar training / equipment can reproduce your work. This is the essence of Section 4 of ASTM E2825. An Amped FIVE Processing Report can be inserted here, as it conforms to ASTM E2825-12(17). A minimal sketch of recipe-style notes follows below.
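To make "recipe-like" concrete, here is a minimal, hypothetical sketch of how processing notes could be captured in a structured way as you work. The step names and settings are placeholders, and this is not how Amped FIVE generates its reports; it only illustrates the level of detail that lets another examiner repeat the steps.

```python
# Hypothetical recipe-style processing notes: record each step and its exact
# settings as you go, then dump the log for inclusion in the report.
import json
from datetime import datetime, timezone

processing_log = []

def log_step(tool, step, settings):
    """Append one processing step with its settings and a UTC timestamp."""
    processing_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "step": step,
        "settings": settings,
    })

# Placeholder example entries (not real case data).
log_step("Amped FIVE", "Video Loader", {"file": "evidence_XXX.avi"})
log_step("Amped FIVE", "Line Doubling", {"interpolation": "linear"})

print(json.dumps(processing_log, indent=2))
```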

Results / Summary

  • Did you encounter any problems or errors? List them.
  • How did you validate your results? Did anyone peer review your work? This can include test/retest or other such validity exams.
  • Conclusions - your opinion goes here. This is the result of your test / experiment / analysis.
  • List of Output File(s) / Derivatives / Demonstratives

Approval(s)

  • Examiner (your name here), along with anyone else whose work is included in the report.
  • Reviewer(s) - was your completed work reviewed? Their name(s).
  • Administrative Approval - did a supervisor approve of the completed exam?
Do your reports look like this? Does the report from the opposing counsel's analyst look like this? If not, why not? It may be an avenue to explore on cross examination. It's best to be prepared.


I know that this is a rather long post, but I wanted to be comprehensive in presenting the topic and to list the sources for the information presented. Hopefully, this proves helpful.

Enjoy.

Friday, December 15, 2017

Sample sizes and determinations of match

It's been a busy fall season, traveling the country and training a whole bunch of folks. Over lunch, the group I was with asked me about a case that's been in the news and wondered if we'd be discussing how to conduct a comparison of headlight spread patterns. That led us down quite the rabbit hole ...

Comparative analysis assumes a "known" and compares it to an "unknown." It's important to consider time & temporality - that one can only TEST in the present - in the "now." For the future / past, one can only PREDICT what happened "then." Testing and Prediction have their own rules.

Take the testing of an image / video of a headlight spread pattern. One attempts to compare the "known" vs. a range of possible "unknowns." Our lunch group mentioned a case where the examiner tested about a dozen trucks in front of the CCTV system that generated the evidentiary video, in addition to the vehicle in evidence, to try to make a determination. The examiner did, in fact, determine a match, as the report indicated.

The question really isn't the appropriateness of the visual comparison. The question is the appropriateness of the sample size such that the results can be useful / trusted. How did the examiner determine an appropriate sample size? Is a dozen trucks appropriate?

Individual headlight direction can be adjusted. Headlights come in pairs. Thus, there are two variables that are not simply on/off. In the world of statistics, they're continuous variables. You're testing two continuous variables against a population of continuous variables to determine uniqueness. Is this possible in real life? What's the appropriate sample size for such a test?

I use a tool called G*Power to calculate sample size. Just about every PhD student does. It's free and quite easy to use once you learn to speak its language. Most, like me, learn its language in graduate-level stats classes.

For example, if you've determined that an F-Test of Variance - Test of Equality is the appropriate statistical test needed to conduct your experiment, then select that test using G*Power.



Press the Calculate button, and G*Power calculates the appropriate sample size. In this case, the appropriate sample size is 266. There's a huge difference between 266 and a dozen. You can plot the results to track the increase in sample size relative to Power. If you want greater confidence in your results (Power), you need a larger sample size.
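If you'd rather script the a-priori calculation than use G*Power's GUI, the same kind of analysis can be run in Python with statsmodels. This is a minimal sketch assuming a medium effect size, an alpha of 0.05, and a desired power of 0.95; note that statsmodels' power classes cover the t-test and ANOVA families rather than G*Power's exact "F test: variances, test of equality," so the numbers here illustrate the workflow, not the 266 figure above.

```python
# A-priori sample size calculation and power curve (illustrative values only).
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

effect_size = 0.5   # Cohen's d (assumed for illustration)
alpha = 0.05        # Type I error rate
power = 0.95        # desired power (1 - Type II error rate)

n_per_group = analysis.solve_power(effect_size=effect_size, alpha=alpha,
                                   power=power, alternative='two-sided')
print(f"Required sample size per group: {n_per_group:.0f}")

# Plot required sample size against desired power, mirroring the curves
# G*Power produces: more power demands a larger sample.
powers = np.linspace(0.5, 0.99, 50)
sizes = [analysis.solve_power(effect_size=effect_size, alpha=alpha, power=p)
         for p in powers]
plt.plot(powers, sizes)
plt.xlabel("Power (1 - beta)")
plt.ylabel("Required sample size per group")
plt.title("Sample size vs. desired power (assumed effect size)")
plt.show()
```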

The examiner's report should include a section about how the sample size was created and why the test used to calculate it was appropriate. It should have graphics like those below to illustrate the results.


It's vitally important that, when conducting a comparative exam and declaring a "match," the examiner understands the necessary science behind that conclusion. "Match" usually does not mean "a Nissan Sentra." That's not helpful given the quantity of Nissan Sentras in a given region. "Match" means "this specific Nissan Sentra." Isn't the standard, "Of all the Nissan Sentras made in that model year, wheresoever dispersed around the globe, it's only this particular one and no other"?

What about the test? Did you choose the appropriate test?

If, on the other hand, you determined that the appropriate test is a non-parametric alternative to the t-test, such as the Wilcoxon signed-rank test, then the sample size would be different. With that test, the appropriate sample size would be 47. That's still not a dozen.
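The same scripted approach works for the t-test family. A common rule of thumb for sizing a Wilcoxon signed-rank test is to compute the t-test sample size and divide by the test's asymptotic relative efficiency (about 0.955 under normality). A minimal sketch with assumed inputs, not a reproduction of the 47 figure above:

```python
# Sample size for a paired / one-sample t-test, then an ARE-based adjustment
# for the Wilcoxon signed-rank test. Inputs are assumed for illustration.
from math import ceil
from statsmodels.stats.power import TTestPower

effect_size = 0.5   # assumed
alpha = 0.05
power = 0.90

n_ttest = TTestPower().solve_power(effect_size=effect_size, alpha=alpha, power=power)

# Wilcoxon signed-rank correction via asymptotic relative efficiency (3/pi ~ 0.955).
n_wilcoxon = ceil(n_ttest / 0.955)
print(f"t-test n: {ceil(n_ttest)}, Wilcoxon signed-rank n (ARE-adjusted): {n_wilcoxon}")
```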


What happens if you like the T-test and opposing counsel's examiner likes the F-test? What happens when two examiners disagree? Do you have the education, training, and confidence to defend your choice and your results in a Daubert hearing?

Perhaps you've been trained in the basics of conducting a comparative examination. But have you been trained / educated in the science of conducting experiments? Do you know how to choose the appropriate tests for your questions? Do you know how to structure your experiment? Do you know how to calculate the appropriate sample size for your tests?

To wrap up, when concluding that a particular vehicle can't be any other because you've compared the headlight spread pattern in a video to several vehicles of the same model / year, it's vitally important to justify the sample size of comparators. You must choose the appropriate test and calculate the sample size based on that test. ASTM E2825-12's requirement that one must produce a report such that another similarly trained / equipped person can reproduce your work means that you must include your notes on the calculation of the sample size. If you haven't done this, you're just guessing and hoping for the best.

Friday, October 27, 2017

Forensically sound

"Is it forensically sound?"

I've heard this question asked many times since I began working in forensic analysis many years ago. Me being me, I wanted to know what it meant to be "forensically sound." Here's what I found as I took a journey through the English language.

"Forensically." The root of this is "forensic."


The root language for the English word "forensic" is the Latin "forensis." It means a public space, a market, or in open court.


Forensis means "of or pertaining to the market or forum." Another way of looking at this can be, activities that happen in the market, forum, public space, or open court.

Ok. We've got "forensically" down. What about "sound"?



Sound, from the Old English, means that which is based on reason, sense, or judgement, and/or that which is competent, reliable, or holding acceptable views.

Put together, and given the context of our work, "forensically sound" can mean that activity, related to work for the court / public, which is well founded, reliable, and logical - which is based on reason, sense, or good judgement.

Great, we've now got a working definition. Now how does it apply to our efforts?

In the US, the Judge acts as the "gatekeeper." In providing this "gatekeeper" function, the Judge should weigh the foundation and reliability of the evidence being submitted in the particular case. When questions arise as to science, validity, and/or reliability, either party can ask the Judge for a hearing on the evidence and explore these issues (i.e. Daubert Hearing).

One of the ways that Judges evaluate the work is by comparing the work product to known standards. In our discipline, we can find standards at ASTM. For image / video processing, the standard is ASTM E2825. Taking a step back, standards are "must do" and guidelines are "may do."

Thus, if you've followed ASTM E2825 (meaning your work can be repeated), and you use valid and reliable tools, your work is "forensically sound." It's a two-part evaluation - you and your tools.

Did you work in a valid, reliable, repeatable, and reproducible way? Are your tools valid and reliable? If the answer is yes to all of these, then your work is forensically sound.

In the many times that I've been asked to evaluate another person's work (i.e. from opposing counsel), this is the standard with which I work. It forms a checklist of sorts.

  • Do I have the same evidence as the opposing side? (i.e. true/exact copy)
  • Is there a written report that conforms to ASTM E2825-12? This assures that I can attempt the tests and thus attempt to reproduce their results.
That's really it for me. Others may concentrate on training and education and certifications. I really don't. If they aren't trained / educated, it will show in their reporting. To be sure, there are avenues to explore if you have the other person's CV (verify memberships, certifications, education etc.). But, I would hope that folks wouldn't embellish their CV. It's so easy to fact check these days, why lie about something that can be easily discovered via Google?

You have a copy of the evidence and the opposing counsel's report. You attempt to reproduce the results. Two things can happen.
  1. You successfully reproduce the results and come to the same conclusion.
  2. Your results differ from those of the opposing counsel's analyst.
If the answer is #1, you're finished. Complete your report and move on. If the answer is #2, can you try to figure out the errors? Your report may include your conclusions as to what went wrong on the other side and why yours is the correct answer. 

I hope this helps...

Wednesday, September 20, 2017

What you know vs. what you can prove

I had an interesting evening. A friend sent a link to a YouTube video, a recorded webinar for a "video analysis" product. I'll admit. I was curious. I watched it. Below is my commentary on what I saw.

The presenter outlined his workflow for working with video from a few different sources. The presentation turned to the difference between how DirectShow handles video playback vs. what's actually in the data stream. The presenter showed how DirectShow may not give you the best version of the data to work with. If you've been around LEVA for a while, you likely know this already. It's good information to know.

Then the presenter did a comparison of a corrected frame from the data stream vs. a frame from the video as played via DirectShow - in Photoshop. He made an interesting statement that prompted me to write this post. He was clicking between the layers so that viewers could "see" that there was a difference between the two frames. The implication was that the viewer could clearly "see" the difference. He was making the point that one frame had more information / a clearer display of the information - illustrating it visually (demonstratively).

This got me thinking - does he know how people "see?"


I'll get to the difference between his workflow (using many tools) and FIVE (using one tool) in a bit. But first, I want to address his point about "clearly seeing."

I do not hide the fact that I am autistic. Thanks to the changes in the DSM, my original brain wiring diagnoses that were made during the reign of DSM IV now place me firmly on the autism spectrum in DSM V. Sensory Processing Disorder and Prosopagnosia have made life rather interesting; especially growing up in a time when doctors didn't understand these wiring issues at all. As an analyst, they present challenges. But they also present opportunities. Can a person doing manual facial comparison be accused of bias if they're face blind? Not sure. Never been asked. But I digress.

I've spent my academic life studying the sensory environment. My PhD dissertation focusses on a college sensory environment so hostile that autistic college students would rather drop out than stick it out. But again, I've studied and written extensively on the issue of what people perceive, so the presenter's statement struck me.

It also struck me from the standpoint of what we "know" vs. what we can prove.

The presenter took viewers on quite a tour of a hex editor, GraphStudio, his preferred "workflow engine," and Photoshop before making the statement that prompted this post. A lot of moving parts. Along the way, the story of how information is displayed and why it's important to "know" where differences can occur was driven home.

Yes, we can all agree that there are differences between how DirectShow shows things and how a raw look at the video shows things. It may be helpful to "see" this. But what if you don't perceive the world in the same way as the storyteller?

Might there be another way to perform this experiment that doesn't rely on the viewer's perception matching that of the presenter?

Thankfully, with FIVE, there is.

The presenter started with Walmart (SN40) video being displayed via DirectShow. So, I'll start there too. SN40, via DirectShow, displays as 4 CIF.


Then, I used FIVE's conversion engine to copy the stream data into a fresh container.
It displays as 2 CIF.


I selected the same frame in each video and bookmarked them for export.



I brought these images back into FIVE for analysis.

The issue with 2 CIF is that, in general, every other line of resolution isn't actually recorded and needs to be restored via some valid and reliable process. FIVE's Line Doubling filter allows me to restore these lines. I can choose the interpolation method during this process. The presenter in the video chose a linear interpolation method to restore the lines (in Photoshop - not his "workflow engine"), so I did the same.
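For readers curious what "line doubling with linear interpolation" means in practice, here is a minimal sketch in Python with OpenCV. It is not Amped FIVE's Line Doubling filter; it only illustrates the general idea of stretching the recorded field back to full height and estimating the missing lines by linear interpolation. The file names are placeholders.

```python
# Illustrative sketch of restoring a 2 CIF field to full height by linear
# interpolation. This is NOT Amped FIVE's Line Doubling filter, only a
# demonstration of the general technique it names.
import cv2

field = cv2.imread("2cif_frame.png")   # placeholder file, e.g. 704 x 240

h, w = field.shape[:2]

# Double the vertical resolution; INTER_LINEAR estimates the missing lines
# by linear interpolation between the recorded ones.
restored = cv2.resize(field, (w, h * 2), interpolation=cv2.INTER_LINEAR)

cv2.imwrite("restored_frame.png", restored)
```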


I've now restored the stream copy frame. I wanted not only to "see" the difference between frames ("what I know"), but also to compute the difference between frames ("what I can prove").

Again, staying in FIVE, I linked the DirectShow frame with the Stream Copy frame using the Video Mixer (found in the Link filter group).


The filter settings for the Video Mixer contain three tabs. The first tab (Inputs) allows the user to choose which processing chains to link, and at what step in the chain.


The second tab (Blend) allows the user to choose what is done with these inputs. In our case, I want to Overlay the two images.


The third tab (Similarity) is where we transition from the demonstrative to the quantitative. Unlike Photoshop's Difference Blend Mode, FIVE doesn't just display a difference (is there a threshold where a difference is present but not displayed by your monitor?); it computes similarity metrics.


With the Similarity Metrics enabled, FIVE computes the Sum of Absolute Difference (SAD), the Peak Signal to Noise Ratio (PSNR), Mean Structural Similarity, and the Correlation Coefficient. The actual difference, computed 4 different ways. You don't just "see" or "know" - you prove.
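For anyone who wants to see how such metrics are computed, the same four figures can be produced with open-source Python libraries. This is a minimal sketch, not FIVE's implementation; the file names are placeholders, and it assumes the two frames have already been brought to the same dimensions.

```python
# Computing the four similarity metrics named above with numpy / scikit-image.
# Illustrative only; file names are placeholders, and this is not FIVE's code.
import numpy as np
import cv2
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

a = cv2.imread("directshow_frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float64)
b = cv2.imread("streamcopy_frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float64)

sad  = np.sum(np.abs(a - b))                          # Sum of Absolute Differences
psnr = peak_signal_noise_ratio(a, b, data_range=255)  # Peak Signal to Noise Ratio
ssim = structural_similarity(a, b, data_range=255)    # Mean Structural Similarity
corr = np.corrcoef(a.ravel(), b.ravel())[0, 1]        # Correlation Coefficient

print(f"SAD: {sad:.0f}  PSNR: {psnr:.2f} dB  SSIM: {ssim:.4f}  r: {corr:.4f}")
```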


The reporting of this is done at the click of the mouse. FIVE has been keeping track of your processing, and the results are compiled and produced on demand - no typing up your report. (My arthritic fingers thank the programmers each day.)


Reports in the PDF/A standard mean the greatest compatibility when dealing with customers. Click on the hyperlink in the report and read the explanation of what was done, the settings, and the academic/scientific source for the test. This means that FIVE's reports are fully compliant with ASTM E2825-12. Are Photoshop's reports compliant? What about your "workflow engine"? Hint, they are if you type them up in such a way as to assure compliance. Who has time for that?

Total time for this experiment was under 5 minutes. I'm sure the presenter could have been faster than he was in the webinar; he was explaining things as he went. But he used a basket of tools - some free and some not free. He also didn't take the time to compile an ASTM E2825-12 compliant report for the viewers. Given the many tools used, I'm not sure how long that takes him to do.

When considering his proposed workflow, you need to consider the total cost of ownership of the whole basket, as well as the cost of training on those tools. You can also factor in how much time is spent/saved doing common tasks. I've noted before that prior to having FIVE, I could do about 6 cases per day. With FIVE, I could do about 30 per day. Given the amount of work in Los Angeles, this was huge.

For my test, I used one tool - Amped FIVE. I could do everything the presenter was doing in one tool, and more. I could move from the demonstrative to the quantitative - in the same tool.

Now, to be fair, I am retired from police service and work full time for Amped Software. OK. But the information presented here is reproducible. If you have FIVE and some Walmart video, you can do the same thing in this one amazing tool. Because I come from LE, I am always evaluating tools and workflows in terms of total cost of ownership. Money for training and tech is often hard to come by in government service, and one wants the best value for the money spent. By this metric, Amped's tools and training offer the best value.

If you want more information about Amped's tools or training, head over to the web site.


Saturday, September 2, 2017

Changing times

I've been in the "video forensics" business for quite some time now. I've seen enough to notice trends in the industry. I've seen people come and go. Today, I want to comment on a coming trend that I believe will impact everyone in the business, LEOs and privateers alike.

Here's what I mean.

Going back to about 2006, the economy was booming and folks were happy. Then 2007 hit and the economy tanked. As belts tightened, people cut back on entertainment and other non-essential things. A result of this was major cut-backs in the movie business. Many out-of-work editors and producers entered the business of video forensics. They guessed that, because of their knowledge of the tools - Avid MC, Premiere Pro, Final Cut, etc. - they could go out there and compete for work, offering their services and "expertise" in video to the courts, attorneys, PIs, and the like. There were few success stories and a lot of colossal fails. Very few of these folks are still around.

Another trend is emerging.

In the push to assure future success, parents have been steering their kids to STEM degrees. Many have pursued and achieved doctorates in the STEM fields only to find that there is a glut of people on the market with such degrees (in my academic field, there's about a 600/1 ratio of applicants to jobs/grants). Some are leaving their degree field, using their expertise in experimental design and statistics (gained by every PhD) in a variety of useful ways (Think Moneyball).

A case* from Arizona last year serves as the canary in the video forensics coal mine. It's a firearms case, but all the issues can easily be applied to our field. In State v Romero (2016), the Arizona Supreme Court said that the trial court erred in not allowing the defense to call their "expert." The person in question wasn't a firearms examiner or a tool-mark examiner. He is an expert in Experimental Design, with a PhD in the discipline.

Here are some relevant parts of the ruling:

"...Dr. Haber was not offered to testify whether Powell had correctly analyzed the toolmarks on the shell casings. Instead, Dr. Haber, based on his expertise in the broader field of experimental design, criticized the scientific reliability of drawing conclusions by comparing tool marks."

"...Arizona Rule of Evidence 702 allows an expert witness to testify if, among other things, the witness is qualified and the expert’s “scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence . . . .” Trial courts serve as the “gatekeepers” of admissibility for expert testimony, with the aim of ensuring such testimony is reliable and helpful to the jury."

Hint, every state court and the US federal courts have a similar rule governing expert witnesses and their testimony.

"... The trial court here concluded that Dr. Haber was not qualified to testify as an expert in firearms identification. In affirming, the court of appeals noted that Dr. Haber, although having reviewed the literature on firearms identification, had not previously been retained as an expert on firearms identification, conducted a toolmark analysis, attempted to identify different firearms, or conducted research on firearms identification. 236 Ariz. at 458 ¶¶ 23-25, 341 P.3d at 500."

"... The issue, however, is not whether Dr. Haber was qualified as an expert in firearms identification, but instead whether he was qualified in the area of his proffered testimony — experimental design. Here, the trial court determined that Powell was qualified to offer an expert opinion that the shell casings were all fired from the same Glock. But Romero did not offer Dr. Haber as an expert in firearms identification to challenge whether Powell had correctly performed his analysis or formed his opinions. Instead, Dr. Haber’s testimony was proffered to help the jury understand how the methods used by firearms examiners in performing toolmark analysis differ from the scientific methods generally employed in designing experiments."

Did you catch that? Dr. Haber was retained to challenge the validity of the method used in the prosecution's examination - to illustrate "... how the methods used by firearms examiners in performing toolmark analysis differ from the scientific methods generally employed in designing experiments."

"... Under Rule 702, when one party offers an expert in a particular field (here, the State’s presentation of Powell as an expert in firearms identification) the opposing party is not restricted to challenging that expert by offering an expert from the same field or with the same qualifications. The trial court should not assess whether the opposing party’s expert is as qualified as — or more convincing than — the other expert. Instead, the court should consider whether the proffered expert is qualified and will offer reliable testimony that is helpful to the jury.  Cf. Bernstein, 237 Ariz. at 230 ¶ 18, 349 P.3d at 204 (noting that when the reliability of an expert’s opinion is a close question, the court should allow the jury to exercise its fact-finding function in assessing the weight and credibility of the evidence)."

"... The gist of Dr. Haber’s proffered testimony was that the methods generally used in conventional toolmark analysis fall short of scientific standards for experimental design. Dr. Haber’s testimony was therefore directed at the scientific weight that should be placed on the results of Powell’s tests. Such questions of weight are emphatically the province of the jury to determine. E.g., State v. Lehr, 201 Ariz. 509, 517 ¶¶ 24–29, 38 P.3d 1172, 1180 (2002). "

"... Apart from Dr. Haber’s qualifications, his testimony would not have been admissible unless it would have been helpful to the jury in understanding the evidence. Ariz. R. Evid. 702(a). The State presented Powell’s testimony that the indentations on shell casings demonstrated that the Glock had fired all the shells, including those at the murder scene, and the State argued that the toolmark comparisons demonstrated a match to “a reasonable degree of scientific certainty.” Dr. Haber’s testimony would have been helpful to the jury in understanding how the toolmark analysis differed from general scientific methods and in evaluating the accuracy of Powell’s conclusions regarding “scientific certainty.”"

"... The thrust of Dr. Haber’s testimony was that the methods underlying toolmark analysis (here comparing indentations and other marks on shell casings) are not based on the scientific method, but instead reflect subjective determinations by the examiner conducting the analysis. Haber would have explained that unlike experts who use other forms of forensic analysis rooted in the scientific method, firearms examiners do not follow an accepted sequential method for evaluating characteristics of fired shell casings and comparing them to control subjects. By describing the methods used by toolmark examiners, Dr. Haber’s testimony could have helped the jury assess how much weight to place on Powell’s “scientific” conclusion that the shell casings at the murder scene could only have been fired from the Glock found by the police when they stopped Romero." How big was the sample size in your experiment? How did you determine the appropriateness of that size? How did the casing's markings compare to a normal distribution of values derived from the sample / control subjects?

"... One of his critiques of the methodology used by firearms examiners is that they do not employ identifiable, standardized protocols." Show me the peer-reviewed, published source that describes the method used.

"... Dr. Haber’s testimony was intended to highlight that the conclusions drawn by firearms examiners from toolmarks do not result from the application of articulable standards and lack typical safeguards of the scientific method such as independent verification by other examiners. Thus, Dr. Haber’s testimony could have helped the jury to understand any eficiencies in the experimental design of toolmark analysis and to assess any suggestion that such analysis was “scientific.” Cf. Salazar-Mercado, 234 Ariz. at 594 ¶ 15, 325 P.3d at 1000 …" Who checked your work and signed-off on it? 

So why such a long post? I saw a video over on Deutsche Welle called "Crime fighting with video forensics." In it, the featured person made this statement: "each vehicle has a unique headlight spread pattern." Does it now? How does he know this? Did he conduct a study? Where is it published? Has he ever been asked to prove out his methodology? What was the sample size of the experiment? How was the appropriate size for the sample calculated? How would his "headlight spread pattern" methodology stand up to cross-examination by an attorney prepared by someone with knowledge of experimental design? Remember, there are a lot of out-of-work PhDs out there. What would happen if Dr. Haber was the opposing expert in your case?

The Reddit Bureau of Investigation tackles the subject here. A link from that page contains the following quote: "... all the things your describing sound almost.... Imperfect? I mean, it scares me to think I might get pinned for a crime because I have a similar headlight spread as someone else … So what I'm asking is, are techniques like headlight spread and clothing identification taken very seriously in court? ..." According to the DW story, the matching of the "headlight spread pattern" did lead to a conviction in the highlighted case. The posts are about 5 years old. Plenty of time for someone to actually test this method and publish results - not just post questions on Reddit. But, I can't find any studies in the academic repositories.

Now, I may seem to be picking on one person. I'm not. I'm picking on the use of techniques that are called "science" but have no foundation in any science or the scientific method. I found police-led training on the subject with a simple Google search. Well-meaning folks will be exposed to this topic and begin to use it in their investigations - perhaps unaware of the challenges to its validity that they may face if/when they testify as to their work.

Errors in conclusions and the use of untested methodologies threaten forensic science. It's not me saying this, it's the focus of the NAS Report. It's the reason the OSAC was created. If you're in the "video forensics" discipline, and you're giving your OPINION about something related to the evidence, PLEASE be sure that your opinion is grounded in valid and reliable science - science that you can quote when asked. For example, if you're using the Rule of Thirds to calculate the height of an unknown subject / object in a CCTV video, you will have problems under a capable cross examination. Where in academics / science can you find a paper that tells you how to employ this method for this purpose? Hint, you can't. If you're using Single View Metrology in your measurements, you'll easily find the source document for this technique as well as the many papers that cite this technique.

And this is where the weakness in many "analysts'" work can be found. When giving your opinion, what is the source of your conclusion? Which paper? Which study? How about simply listing your references / sources in your report so there's no confusion as to the basis of your opinions?

My entry into grad school opened my eyes as to what I didn't know and what the various trade groups where I'd received my training couldn't prepare me for. My pathway to my dissertation had me laser focussed on stats, experimental design, sample sizes, validity, and defending my work in front of people who have gone down a similar path and know way more than me. It's humbling to defend one's work - to be cross-examined by such brilliant people. But, iron sharpens iron. I'm the better for it.

Rather than tell you it'll be OK, I'm saying: watch out. You're heading down an unsustainable path. If folks want to continue to use this method - "headlight spread pattern analysis" - probability says that there's going to be a challenge. Do you want that to be you? Are you prepared for it?

Something to think about ...

*I'm not an attorney. This is not legal advice. This is not about one person or one case, but the use of untested / un-scientific techniques. Check your six. Relax. Breathe. Love.