I am trying to find the extension of a file, given its name as a string. I know I can use the function os.path.splitext, but it does not work as expected when the file extension is .tar.gz or .tar.bz2: it gives the extensions as .gz and .bz2 instead of .tar.gz and .tar.bz2 respectively. So I tried a regular expression instead:
print(re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
>>> gz            # I want this to come out as 'tar.gz'
print(re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.bz2').group('ext'))
>>> bz2           # I want this to come out as 'tar.bz2'
I am using (?P<ext>...) in my pattern matching as I also want to get the extension.  
Please help.
import os

root, ext = os.path.splitext('a.tar.gz')
if ext in ['.gz', '.bz2']:
    # Compressed archive: prepend the inner extension (e.g. '.tar').
    ext = os.path.splitext(root)[1] + ext
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
>>> import re
>>> print(re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
gz
>>> print(re.compile(r'^.*?[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
tar.gz
The ? after .* makes the quantifier non-greedy, so it matches as few characters as possible. Instead of .* consuming ".tar" as well, .*? stops at the shortest prefix that still allows .tar.gz to be matched.
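The non-greedy pattern can be wrapped in a small helper. This is only a sketch: the names EXT_PATTERN and extract_ext are made up for illustration, and the code assumes Python 3.

```python
import re

# Non-greedy prefix (.*?) so the multi-part alternatives are tried
# before the generic \w+ fallback can claim only the last part.
EXT_PATTERN = re.compile(r'^.*?[.](?P<ext>tar\.gz|tar\.bz2|\w+)$')

def extract_ext(name):
    """Return the extension without its leading dot, or None if there is none."""
    match = EXT_PATTERN.match(name)
    return match.group('ext') if match else None

print(extract_ext('a.tar.gz'))   # tar.gz
print(extract_ext('a.tar.bz2'))  # tar.bz2
print(extract_ext('notes.txt'))  # txt
print(extract_ext('README'))     # None
```

Returning None for names without a dot avoids the AttributeError you would get from calling .group() on a failed match.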
I have an idea that is much easier than racking your brain over a regex, though it may sound silly:
name = "filename.tar.gz"
extensions = ('.tar.gz', '.py')
[x for x in extensions if name.endswith(x)]
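If the list of known extensions contains overlapping entries (say both '.gz' and '.tar.gz'), the comprehension returns every match. A possible refinement is to keep only the longest one. The helper name longest_known_extension is hypothetical, and this is only a sketch:

```python
def longest_known_extension(name, extensions):
    """Return the longest entry of `extensions` that `name` ends with, or None."""
    matches = [ext for ext in extensions if name.endswith(ext)]
    return max(matches, key=len) if matches else None

known = ('.gz', '.tar.gz', '.py')
print(longest_known_extension('filename.tar.gz', known))  # .tar.gz
print(longest_known_extension('script.py', known))        # .py
print(longest_known_extension('README', known))           # None
```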
Starting from phihag's answer:
import os

# Leading dots included, so that e.g. 'footar.gz' is not treated as a '.tar.gz' file.
DOUBLE_EXTENSIONS = ['.tar.gz', '.tar.bz2']  # Add extra extensions where desired.

def guess_extension(filename):
    """
    Guess the extension of given filename.
    Returns a (root, extension) tuple.
    """
    root, ext = os.path.splitext(filename)
    if any(filename.endswith(x) for x in DOUBLE_EXTENSIONS):
        root, first_ext = os.path.splitext(root)
        ext = first_ext + ext
    return root, ext
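This is simple and works with both single and multiple extensions. Note that it returns the base name, i.e. everything before the first dot, rather than the extension: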
In [1]: '/folder/folder/folder/filename.tar.gz'.split('/')[-1].split('.')[0]
Out[1]: 'filename'
In [2]: '/folder/folder/folder/filename.tar'.split('/')[-1].split('.')[0]
Out[2]: 'filename'
In [3]: 'filename.tar.gz'.split('/')[-1].split('.')[0]
Out[3]: 'filename'
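The same "split at the first dot" idea can also return the extension itself. Here is a rough sketch using os.path.basename and str.partition; the function name split_at_first_dot is made up for illustration. Like the split('.') approach, it assumes the base name contains no dots of its own, so 'v1.2.tar.gz' would be split after 'v1'.

```python
import os

def split_at_first_dot(path):
    """Split the file name at its first dot into (base, extension-without-dot)."""
    base, _, ext = os.path.basename(path).partition('.')
    return base, ext

print(split_at_first_dot('/folder/folder/folder/filename.tar.gz'))  # ('filename', 'tar.gz')
print(split_at_first_dot('filename.tar'))                           # ('filename', 'tar')
print(split_at_first_dot('README'))                                 # ('README', '')
```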
Continuing from phihag's answer: to strip all double or triple extensions generically, as in CropQDS275.jpg.aux.xml, loop with while '.' in:
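import os

tempfilename, file_extension = os.path.splitext(filename)
while '.' in tempfilename:
    tempfilename, tempfile_extension = os.path.splitext(tempfilename)
    if not tempfile_extension:  # e.g. '.bashrc': nothing left to strip, so stop instead of looping forever
        break
    file_extension = tempfile_extension + file_extension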
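On Python 3 the same result can probably be had without a loop, using the suffixes attribute of pathlib.PurePath. This is a different technique from the splitext loop, shown only as a sketch; the function name all_extensions is hypothetical.

```python
from pathlib import PurePath

def all_extensions(filename):
    """Join every suffix of the file name, e.g. '.jpg.aux.xml'."""
    return ''.join(PurePath(filename).suffixes)

print(all_extensions('CropQDS275.jpg.aux.xml'))  # .jpg.aux.xml
print(all_extensions('a.tar.gz'))                # .tar.gz
print(all_extensions('README'))                  # prints an empty line
```

Like the loop, this treats every dot-separated part after the first as an extension, so a name such as report.v1.2.txt would yield '.v1.2.txt'.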