XMLLINT - Remove XML Tags

simon_brooke
New Contributor III

Hi All,

I am using a script to pull data from the API. It works 100%, but I need to remove the XML tags from the data as the results get sent over for GDPR requests.

I am using the following at the end of the script.

xmllint --format -

Thanks
Simon


mm2270
Legendary Contributor II

Hmm, I'm not exactly sure what you mean by removing the XML tags. The xmllint --format - command simply pretty-prints its input as indented, well-formed XML; it isn't designed to strip the tags. Can you explain in a little more detail what type of output you're looking for? For example, if you just want the text between the opening and closing tags, you may want to explore xpath. You can also pull the values out of the xmllint --format output using awk or other tools if necessary.
I can probably provide some examples of how to do the above, but it would be helpful to have a better understanding of what you're expecting to get from the API output.

simon_brooke
New Contributor III

Hi,

Looks like you are right about xpath. I just wanted to export the actual data instead of the XML tags with the data.

Here is an example of what I am using

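#!/bin/bash

# API credentials and the email address to look up (left blank here)
apiuser=""
password=""
emailaddress=""
# Request the user record as XML and pretty-print it
curl -sku "$apiuser:$password" -H "accept: text/xml" "https://.jamfcloud.com/JSSResource/users/email/$emailaddress" | xmllint --format -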

Tribruin
Contributor III

Are there specific fields that you need to pull from the result? As @mm2270 mentioned, you can use xpath to pull individual fields. For example, if you need the full_name, pipe your result into the following:

xmllint --xpath '/users/user/full_name/text()' -

You just need to know the full path to the field and make sure to add the text() to strip the xml tags.

simon_brooke
New Contributor III

Thanks for the response. Is there any way to add extra fields to the results, e.g. to add email_address?

sdagley
Honored Contributor II

@simon.brooke There are a couple of ways you could deal with adding extra fields to your output:

1) Use a variable to store the result of the API call, then do repeated xpath text() queries to get the fields you want
2) Use XSLT to parse the result of the API call and extract all of the fields you want in one shot. See the thread "JAMF API - Export Serial Number and Computer Name to CSV file" for an example.
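
As a rough sketch of option 1, assuming the response has the /users/user/... layout used in the xpath example earlier in the thread, the API result can be captured once and then queried per field. The sample XML, its values, and the get_field helper are all made up for illustration; in a real script the xml variable would hold the output of the curl call.

```shell
#!/bin/bash
# Hypothetical sample standing in for the API response; in a real script this
# would be xml=$(curl ...) using the curl command from the original post.
xml='<users><user><id>1</id><name>sbrooke</name><full_name>Simon Example</full_name><email_address>simon@example.com</email_address></user></users>'

# get_field is an illustrative helper, not part of xmllint or the Jamf API.
# $1 = XML document, $2 = element name under /users/user
get_field() {
    printf '%s\n' "$1" | xmllint --xpath "/users/user/$2/text()" - 2>/dev/null
}

# One xpath text() query per field, all against the same stored response.
for field in full_name email_address; do
    printf '%s: %s\n' "$field" "$(get_field "$xml" "$field")"
done
```

Storing the response first means the API is only called once, however many fields are extracted afterwards.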

mm2270
Legendary Contributor II

xpath only allows you (as far as I know) to pull one field at a time. I wish there was an easy way to extract multiple fields in the same path at once, and if some xpath guru here knows how to do that, I'd love to know it. But I haven't been able to get it to do that.
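
For what it's worth, XPath 1.0 does have a union operator, |, which combines the node sets from two paths in a single expression. A minimal sketch follows, assuming the /users/user/... layout from the xpath example earlier in the thread and using made-up sample data. How xmllint separates the results depends on the libxml2 version (older builds may run the values together on one line), so treat this as something to test on your own system rather than a guaranteed fix.

```shell
#!/bin/bash
# Hypothetical sample data, not real API output.
xml='<users><user><full_name>Simon Example</full_name><email_address>simon@example.com</email_address></user></users>'

# One xpath expression, two paths joined with the union operator "|".
result=$(printf '%s\n' "$xml" |
    xmllint --xpath '/users/user/full_name/text() | /users/user/email_address/text()' - 2>/dev/null)

printf '%s\n' "$result"
```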

One way I've done this, pulling out multiple fields at once, is with xmllint and awk with a field separator. Try something like this:

curl -sku "$apiuser:$password" -H "accept: text/xml" https://.jamfcloud.com/JSSResource/users/email/$emailaddress | xmllint --format - | awk -F'>|<' '/full_name|email_address/{print $3}'

A little explanation of the above. awk can split each line on field separators to get specific sections of a string, and it also supports regex matching, so it can operate on only the lines you direct it to. The awk -F'>|<' part tells awk to use > and < as the field separators, meaning it splits each line at those two characters. Since all our data from the API call sits between XML tags that have those characters before and after the actual text, this generally works well. The second half, '/full_name|email_address/{print $3}', tells it to act only on lines that contain full_name or email_address; the vertical pipe between them is regex alternation, so a line matching either name is selected. The {print $3} prints "field 3", which in this case lines up with the text between the tags. In case you're wondering how to know it's field 3, that just takes a little experimenting, since it's not immediately obvious.
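
To see where field 3 comes from, it can help to print every field awk produces for one line. The sketch below feeds a single made-up line, indented the way xmllint --format would indent it, through the same field separator. The show_fields name and the sample value are only for illustration.

```shell
#!/bin/bash
# show_fields is an illustrative helper: it prints each field of every input
# line with its index, using the same > and < separators as the command above.
show_fields() {
    awk -F'>|<' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

printf '    <full_name>Simon Example</full_name>\n' | show_fields
```

On the sample line, the indentation before the first < ends up as $1, the tag name as $2, and the text between the tags as $3.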

Anyway, give that a shot and see if it gets you what you want. And obviously you can add additional lines to look for, like phone_number or position. But note that you'll run into issues when using a more generic tag like name since that shows up on many lines, not just one. For those, it would be better to use xpath to drill down to the specific "name" item you want to pull out.
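
The problem with a generic tag such as name can be illustrated with awk alone. In the made-up record below (the element layout is an assumption, not copied from real API output), the pattern /name/ matches three lines, including full_name, because it is a substring match.

```shell
#!/bin/bash
# Hypothetical, pre-formatted record; the nested <site> element is an assumption.
sample='<user>
  <name>sbrooke</name>
  <full_name>Simon Example</full_name>
  <site>
    <name>Head Office</name>
  </site>
</user>'

# Loose match: any line containing "name", so full_name is caught as well.
loose=$(printf '%s\n' "$sample" | awk -F'>|<' '/name/{print $3}')

# Stricter match: the tag itself must be exactly "name"; this still returns
# both <name> elements, because awk has no notion of where a tag is nested.
strict=$(printf '%s\n' "$sample" | awk -F'>|<' '$2 == "name"{print $3}')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

Even the stricter comparison cannot tell the user's name from the site's name, which is why an xpath expression with the full path is the safer choice for tags like this.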

simon_brooke
New Contributor III

Hi all,

Thanks for the responses. I was able to use text() to take away the XML tags.
All I had to do was run another curl command to pull the rest of the data.